Abstract

In order to analyse universal patterns in the large space-time behaviour of interacting
multi-type stochastic populations on countable geographic spaces, a key approach has
been to carry out a renormalisation analysis in the hierarchical mean-field limit. This has
provided considerable insight into the structure of interacting systems of finite-dimensional
diffusions, such as Fisher-Wright or Feller diffusions, and their infinite-dimensional
analogues, such as Fleming-Viot or Dawson-Watanabe superdiffusions.

The present paper brings a new class of interacting jump processes into focus. We
start from a single-colony $C^\Lambda$-process, which arises as the continuum-mass limit of a
$\Lambda$-Cannings individual-based population model, where $\Lambda$ is a finite non-negative measure
that describes the offspring mechanism, i.e., how individuals in a single colony are replaced
via resampling. The key feature of the $\Lambda$-Cannings individual-based population model is
that the offspring of a single individual can be a positive fraction of the total population.
After that we introduce a system of hierarchically interacting $C^\Lambda$-processes, where the
interaction comes from migration and reshuffling-resampling on all hierarchical space-time
scales simultaneously. More precisely, individuals live in colonies labelled by the
hierarchical group $\Omega_N$ of order $N$, and are subject to migration based on a sequence of
migration coefficients $c = (c_k)_{k\in\mathbb{N}_0}$ and to reshuffling-resampling based on a sequence of
resampling measures $\Lambda = (\Lambda_k)_{k\in\mathbb{N}_0}$, both acting in $k$-blocks for all $k \in \mathbb{N}_0$. The reshuffling
is linked to the resampling: before resampling in a block takes place all individuals in that
block are relocated uniformly, i.e., resampling is done in a locally "panmictic" manner.
We refer to this system as the $C_N^{c,\Lambda}$-process. The dual process of the $C^\Lambda$-process is the
$\Lambda$-coalescent, whereas the dual process of the $C_N^{c,\Lambda}$-process is a spatial coalescent with
multi-level block coalescence.

For the above system we carry out a full renormalisation analysis in the hierarchical
mean-field limit $N \to \infty$. Our main result is that, in the limit as $N \to \infty$, on each
hierarchical scale $k \in \mathbb{N}_0$ the $k$-block averages of the $C_N^{c,\Lambda}$-process converge to a random
process that is a superposition of a $C^{\Lambda_k}$-process and a Fleming-Viot process, the latter with
a volatility $d_k$ and with a drift of strength $c_k$ towards the limiting $(k+1)$-block average.
It turns out that $d_k$ is a function of $c_l$ and $\Lambda_l$ for all $0 \le l < k$. Thus, it is through
the volatility that the renormalisation manifests itself. We investigate how $d_k$ scales as
$k \to \infty$, which requires an analysis of compositions of certain Möbius-transformations,
and leads to four different regimes.
We discuss the implications of the scaling of $d_k$ for the behaviour on large space-time
scales of the $C_N^{c,\Lambda}$-process. We compare the outcome with what is known from the
renormalisation analysis of hierarchically interacting Fleming-Viot diffusions, pointing out
several new features. In particular, we obtain a new classification for when the process
exhibits clustering (= develops spatially expanding mono-type regions), respectively,
exhibits local coexistence (= allows for different types to live next to each other with positive
probability). Here, the simple dichotomy of recurrent versus transient migration for
hierarchically interacting Fleming-Viot diffusions, namely, $\sum_{k\in\mathbb{N}_0} (1/c_k) = \infty$ versus $< \infty$,
is replaced by a dichotomy that expresses a trade-off between migration and
reshuffling-resampling, namely, $\sum_{k\in\mathbb{N}_0} (1/c_k) \sum_{l=0}^{k} \Lambda_l([0,1]) = \infty$ versus $< \infty$. Thus, while recurrent
migrations still only give rise to clustering, there now are transient migrations that do the
same when the block resampling is strong enough, namely, $\sum_{l\in\mathbb{N}_0} \Lambda_l([0,1]) = \infty$.
Moreover, in the clustering regime we find a richer scenario for the cluster formation than for
Fleming-Viot diffusions. In the local-coexistence regime, on the other hand, we find that
the types initially present only survive with a positive probability, not with probability
one as for Fleming-Viot diffusions. Finally, we show that for finite $N$ the same dichotomy
between clustering and local coexistence holds as for $N \to \infty$, even though we lack proper
control on the cluster formation, respectively, on the distribution of the types that survive.
1 Introduction and main results
1.1 Outline 



Section 1.2 provides the background for the paper. Section 1.3 defines the single-colony and
the multi-colony $C^\Lambda$-process, as well as the so-called McKean-Vlasov $C^\Lambda$-process, a
single-colony $C^\Lambda$-process with immigration and emigration from and to a cemetery state arising in
the context of the scaling limit of the multi-colony $C^\Lambda$-process with mean-field interaction.

Section 1.4 defines a new process, the $C_N^{c,\Lambda}$-process, where the countably many colonies are
labelled by the hierarchical group $\Omega_N$ of order $N$, and the migration and the
reshuffling-resampling on successive hierarchical space-time scales are governed by a sequence
$c = (c_k)_{k\in\mathbb{N}_0}$ of migration coefficients and a sequence $\Lambda = (\Lambda_k)_{k\in\mathbb{N}_0}$ of resampling
measures. Section 1.5 introduces multiple space-time scales and a collection of renormalised
systems. It is shown that, in the hierarchical mean-field limit $N \to \infty$, the block averages
of the $C_N^{c,\Lambda}$-process on hierarchical space-time scale $k$ converge to a McKean-Vlasov process
that is a superposition of a single-colony $C^{\Lambda_k}$-process and a single-colony Fleming-Viot
process with a volatility $d_k$ that is a function of $c_l$ and $\Lambda_l$ for all $0 \le l < k$, and a
drift of strength $c_k$ towards the limiting $(k+1)$-st block average. The scaling of $d_k$ as
$k \to \infty$ turns out to have several universality classes. Section 1.5.5 discusses the
implications of this scaling for the behaviour of the $C_N^{c,\Lambda}$-process on large space-time
scales, and compares the outcome with what is known for hierarchically interacting
Fleming-Viot diffusions.

A key feature of the $C_N^{c,\Lambda}$-process is that it has a spatial $\Lambda$-coalescent with block migration
and block coalescence as a dual process. This duality, which is of intrinsic interest, and the
properties of the dual process are worked out in Section 2. The proofs of the main theorems
are given in Sections 3-11. To help the reader, a list of the main symbols used in the paper
is added in Section 



1.2 Background 

1.2.1 Population dynamics 

For the description of spatial populations subject to migration and to neutral stochastic
evolution (i.e., resampling without selection, mutation or recombination), it is common to use
variants of interacting Fleming-Viot diffusions (Dawson [D93], Donnelly and Kurtz [DK99],
Etheridge [E00], [E11]). These are processes taking values in $\mathcal{P}(E)^I$, where $I$ is a countable
Abelian group playing the role of a geographic space labelling the colonies of the population
(e.g. $\mathbb{Z}^d$, the $d$-dimensional integer lattice, or $\Omega_N$, the hierarchical group of order $N$), $E$ is
a compact Polish space playing the role of a type space encoding the possible types of the
individuals living in these colonies (e.g. $[0,1]$), and $\mathcal{P}(E)$ is the set of probability measures on
$E$. An element in $\mathcal{P}(E)^I$ specifies the frequencies of the types in each of the colonies in $I$.

Let us first consider the (locally finite) populations of individuals from which the above 
processes arise as continuum-mass limits. Assume that the individuals migrate between the 
colonies according to independent continuous-time random walks on $I$. Inside each colony,
the evolution is driven by a change of generation called resampling. Resampling, in its
simplest form (Moran model), means that after exponential waiting times a pair of individuals
("the parents") is replaced by a new pair of individuals ("the children"), who randomly and
independently adopt the type of one of the parents. The process of type frequencies in each
of the colonies as a result of the migration and the resampling is a jump process taking values
in $\mathcal{P}(E)^I$.

If we pass to the continuum mass limit of the frequencies by letting the number of
individuals per colony tend to infinity, then we obtain a system of interacting Fleming-Viot diffusions
(Dawson, Greven and Vaillancourt [DGV95]). By picking different resampling mechanisms,
occurring at a rate that depends on the state of the colony, we obtain variants of interacting
Fleming-Viot diffusions with a state-dependent resampling rate [DM95]. In this context, key
questions are: To what extent does the behaviour on large space-time scales depend on the
precise form of the resampling mechanism? In particular, to what extent is this behaviour
universal? For Fleming-Viot models and a small class of state- and type-dependent Fleming-Viot
models this question has been answered in [DGV95].

If we consider resampling mechanisms where, instead of a pair of individuals, a positive
fraction of the local population is replaced (an idea due to Cannings [C74], [C75]), then we
enter the world of jump processes. In this paper, we will focus on jump processes that are
parametrised by a measure $\Lambda$ on $[0,1]$ that models the random proportion of offspring in
the population generated by a single individual in a resampling event. It has been argued
by many authors that such jump processes are suitable for describing situations with
little biodiversity. For instance, the jumps may account for selective sweeps, or for extreme
reproduction events (occurring on smaller time scales and in a random manner so that an
effectively neutral evolution results), such as those observed in certain marine organisms, e.g.
Atlantic cod or Pacific oyster (Eldon and Wakeley [EW06]). It is argued in Der, Epstein
and Plotkin [DEP11] that mixtures of diffusive dynamics and Cannings dynamics provide a
better fit for generation-by-generation empirical data from Drosophila populations. Birkner
and Blath [BB08, BB09-MVD] treat the issue of statistical inference on the genealogies
corresponding to a one-parameter family of Cannings dynamics.

Our goal is to describe the effect of jumps in a spatial setting with a volatile reproduction. 
To that end we study a system of hierarchically interacting Cannings processes. The interac- 
tion is chosen in such a way that the hierarchical lattice mimics the two-dimensional Euclidean 
space, as will become clear later on. On top of migration and single-colony resampling, we add 
multi-colony resampling by carrying out a Cannings-type resampling in all blocks simultane- 
ously, combined with a reshuffling of the individuals inside the block before the resampling 
is done. The reshuffling mimics the fact that in reproduction the local geographic interaction 
typically takes place on a smaller time scale, in a random manner, and effectively results in a 
Cannings jump during a single observation time. We will see that in our model the reshuffling 
allows us to define the process for all $\Lambda$, and simplifies the analysis by avoiding the need to
compensate for small jumps. 

The idea to give reproduction a non-local geographic structure, in particular, in two
dimensions, was exploited in Barton, Etheridge and Véber [BEV10] and Berestycki, Etheridge
and Véber [BEV] also. There, the process on the torus of sidelength $L$ is constructed via its
dual, and it is shown that a limiting process on $\mathbb{R}^2$ exists as $L \to \infty$. In [BEV10], [BEV] it is
assumed that the individual lineages are compound Poisson processes. Freeman [FY] considers
a particular case of the spatially structured Cannings model with a continuum self-similar
geographic space, where all individuals in a block are updated upon resampling. This setup
does not require compensation for small jumps and allows for their accumulation.

1.2.2 Renormalisation 

A key approach to understand universality in the behaviour of interacting systems has been
a renormalisation analysis of block averages on successive space-time scales combined with
a hierarchical mean-field limit. In this setting, one replaces $I$ by the hierarchical group $\Omega_N$
of order $N$ and passes to the limit $N \to \infty$ ("the hierarchical mean-field limit").^1 With
the limiting dynamics obtained through the hierarchical mean-field limit one associates a
(nonlinear) renormalisation transformation $\mathcal{F}_c$ (which depends on the migration rate $c$), acting
on the resampling rate function $g$ driving the diffusion in single colonies. One studies the orbit
$(\mathcal{F}^{[k]}(g))_{k\in\mathbb{N}}$, with $\mathcal{F}^{[k]} = \mathcal{F}_{c_{k-1}} \circ \cdots \circ \mathcal{F}_{c_0}$, characterising the behaviour of the system on an



1 Actually, this set-up provides an approximation for the geographic space $I = \mathbb{Z}^2$, on which simple random
walk migration is critically recurrent (Dawson, Gorostiza and Wakolbinger [DGW]). We will comment on this
issue in Section 1.4.2.
increasing sequence of space-time scales, where $(c_k)_{k\in\mathbb{N}_0}$ represents the sequence of migration
coefficients, with the index $k$ labelling the hierarchical distance. The universality classes of the
system are associated with the fixed points (or the fixed shapes) of $\mathcal{F}_c$, i.e., $g$ with $\mathcal{F}_c(g) = a g$
with $a = 1$ (or $a = a(c) \in (0,\infty)$).

The above renormalisation program was developed for various choices of the single-colony
state space. Each such choice gives rise to a different universality class with specific features
for the large space-time behaviour. For the stochastic part of the renormalisation program
(i.e., the derivation of the limiting renormalised dynamics), see Dawson and Greven [DG93a],
[DG93b], [DG93c], [DG96], [DG99], [DG03], Dawson, Greven and Vaillancourt [DGV95], and
Cox, Dawson and Greven [CDG04]. For the analytic part (i.e., the study of the
renormalisation map $\mathcal{F}$), see Baillon, Clément, Greven and den Hollander [BCGH95], [BCGH97], den
Hollander and Swart [HS98], and Dawson, Greven, den Hollander, Sun and Swart [DGHSS08].

So far, two important classes of single-colony processes could not be treated: Anderson
diffusions [GH07] and jump processes. In the present paper, we focus on the second class, in
particular, on so-called $C^\Lambda$-processes. In all previously treated models, the renormalisation
transformation was a map $\mathcal{F}_c$ acting on the set $\mathcal{M}(E)$ of measurable functions on $E$, the
single-component state space, while the function $g$ was a branching rate, a resampling rate or
other, defining a diffusion function $x \mapsto x g(x)$ on $[0,\infty)$ or $x \mapsto x(1-x) g(x)$ on $[0,1]$, etc. In
the present paper, however, we deal with jump processes that are characterised by a sequence
of finite measures $\Lambda = (\Lambda_k)_{k\in\mathbb{N}_0}$ on $[0,1]$, and we obtain a renormalisation map $\mathcal{F}_c$ acting on a
pair $(g, \Lambda)$, where $g \in \mathcal{M}(E)$ characterises diffusive behaviour and $\Lambda$ characterises resampling
behaviour. It turns out that the orbit of this map is of the form

(1.1) $\big( d_k g^*, (\Lambda_l)_{l \ge k} \big)_{k\in\mathbb{N}_0}$,

where $g^* \equiv 1$ and $d_k$ depends on $d_{k-1}$, $c_{k-1}$ and the total mass of $\Lambda_{k-1}$. Here, as before,
$c = (c_k)_{k\in\mathbb{N}_0}$ is the sequence of migration coefficients. The reason behind this reduction is that
our single-colony process is a superposition of a $C^\Lambda$-process and a Fleming-Viot process with
state-independent resampling rates and that both these processes renormalise to a multiple of
the latter. It turns out that $d_k$ can be expressed in terms of compositions of certain
Möbius-transformations with parameters changing from composition to composition. It is through
these compositions that the renormalisation manifests itself.

If the single-colony process were a superposition of a $C^\Lambda$-process and a Fleming-Viot
process with state-dependent resampling rate, i.e., if $g$ were not a constant but a function
of the state, then the renormalisation transformation would be much more complicated. It
remains a challenge to deal with this generalisation.



1.3 The Cannings model 



The $\Lambda$-Cannings model involves a finite non-negative measure $\Lambda \in \mathcal{M}_f([0,1])$. We focus on
the special case with

(1.2) $\Lambda(\{0\}) = 0$

satisfying the so-called dust-free condition

(1.3) $\int_{[0,1]} \frac{\Lambda(dr)}{r} = \infty$.



Condition (1.2) excludes the well-studied case of interacting Fleming-Viot diffusions, i.e., we
focus on the jumps in the $\Lambda$-Cannings model. Condition (1.3) excludes cases where the jump
sizes do not accumulate. Moreover, this condition is needed to have well-defined proportions
of the different types in the population in the infinite-population limit (Pitman [P99]), and
also to be able to define a genealogical tree for the population (Greven, Pfaffelhuber and
Winter [GPW09]).^2

In Sections 1.3.1-1.3.3 we build up the Cannings model in three steps: single-colony
$C^\Lambda$-process, multi-colony $C^\Lambda$-process, and $C^\Lambda$-process with immigration-emigration
(McKean-Vlasov limit).

1.3.1 Single-colony $C^\Lambda$-process

We recall the definition of the $\Lambda$-Cannings model in its simplest form. This model describes
the evolution of allelic types of finitely many individuals living in a single colony. Let $M \in \mathbb{N}$
be the number of individuals, and let $E$ be a compact Polish space encoding the types (a
typical choice is $E = [0,1]$). The evolution of the population, whose state space is $E^M$, is as
follows.

• The number of individuals stays fixed at M during the evolution. 

• Initially, i.i.d. types are assigned to the individuals according to a given distribution

(1.4) $\theta \in \mathcal{P}(E)$.

• Let $\Lambda^* \in \mathcal{M}([0,1])$ be the $\sigma$-finite measure defined as

(1.5) $\Lambda^*(dr) = \frac{\Lambda(dr)}{r^2}$.

Consider an inhomogeneous Poisson point process on $[0,\infty) \times [0,1]$ with intensity measure

(1.6) $dt \otimes \Lambda^*(dr)$.

For each point $(t, r)$ in this process, we carry out the following transition at time $t$. Mark
each of the $M$ individuals independently with a 1 or a 0 with probability $r$, respectively,
$1 - r$. All individuals marked by a 1 are killed and are replaced by copies of a single
individual (= "parent") that is randomly chosen among all the individuals marked by a
1 (see Fig. 1).

In this way, we obtain a pure-jump Markov process, which is called the $\Lambda$-Cannings model
with measure $\Lambda$ and population size $M$.
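
The transition carried out at a single point $(t, r)$ of the Poisson point process can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `cannings_event`, the list-based representation of the population and the example types are hypothetical and not part of the paper, and the sampling of the points $(t, r)$ from $dt \otimes \Lambda^*(dr)$ is left out.

```python
import random


def cannings_event(types, r, rng):
    """Apply one resampling event with marking probability r (hypothetical helper).

    Each individual is marked independently with probability r. All marked
    individuals are replaced by copies of one parent chosen uniformly among
    the marked individuals.
    """
    marked = [i for i in range(len(types)) if rng.random() < r]
    if len(marked) < 2:
        # Fewer than two marked individuals: the state does not change.
        return list(types)
    parent_type = types[rng.choice(marked)]
    new_types = list(types)
    for i in marked:
        new_types[i] = parent_type
    return new_types


rng = random.Random(0)
population = ["a", "a", "a", "b", "b", "b", "b", "b"]  # M = 8, two types
after = cannings_event(population, 0.5, rng)
```

The population size is preserved by construction, and no new types are created: an event can only increase the frequency of the parent's type at the expense of the other marked individuals.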

Note that, for a jump to occur, at least two individuals marked by a 1 are needed. Hence,
for finite $M$, the rate at which some pair of individuals is marked is

(1.7) $\int_{[0,1]} \Lambda^*(dr)\, \tfrac12 M(M-1)\, r^2 = \tfrac12 M(M-1)\, \Lambda([0,1]) < \infty$,

and so only finitely many jumps occur in any finite time interval.

By observing the frequencies of the types, i.e., the number of individuals with a given
type divided by $M$, we obtain a measure-valued pure-jump Markov process on $\mathcal{P}(E)$. Letting



2 Condition (1.2) is relevant for some of the questions addressed in this paper, though not for all. We
comment on this issue as we go along. Another line of research would be to work with the most general Cannings
models that allow for simultaneous multiple resampling events. We do not pursue such a generalisation here.
Figure 1: Cannings resampling event in a colony of $M = 8$ individuals of two types. Arrows
indicate type inheritance, X indicates death.



$M \to \infty$, we obtain a limiting process $X = (X(t))_{t \ge 0}$, called the $C^\Lambda$-process, which is a
strong Markov jump process with paths in $D([0,\infty), \mathcal{P}(E))$ (the set of càdlàg paths in $\mathcal{P}(E)$
endowed with the Skorokhod $J_1$-topology) and can be characterised as the solution of a
well-posed martingale problem (Donnelly and Kurtz [DK99]). This process has countably many
jumps in any finite time interval if $\Lambda((0,1]) > 0$ and is the Fleming-Viot diffusion if $\Lambda = \delta_0$.
The latter corresponds to Moran resampling.

1.3.2 Multi-colony $C^\Lambda$-process: mean-field version

Next, we consider the spatial $\Lambda$-Cannings model in its standard mean-field version. Consider
as geographic space a block of sites $\{0, \ldots, N-1\}$ and assign $M$ individuals to each site
(= colony). The evolution of the population, whose state space is $(E^M)^N$, is defined as the
following pure-jump Markov process.

• The total number of individuals stays fixed at NM during the evolution. 

• At the start, each individual is assigned a type that is drawn from E according to some 
prescribed exchangeable law. 

• Individuals migrate between colonies at rate $c > 0$, jumping according to the uniform
distribution on $\{0, \ldots, N-1\}$ (see Fig. 2).

• Individuals resample within each colony according to the $\Lambda$-Cannings model with
population size corresponding to the current size of the colony.

By considering the frequencies of the types in each of the colonies, we obtain a pure-jump
Markov process taking values in $\mathcal{P}(E)^N$.

Letting $M \to \infty$, we pass to the continuum mass limit and we obtain a system of $N$
interacting $C^\Lambda$-processes, denoted by

(1.8) $X^{(N)} = (X^{(N)}(t))_{t \ge 0}$ with $X^{(N)}(t) = \{X_i^{(N)}(t)\}_{i=0}^{N-1} \in \mathcal{P}(E)^N$.

The process can be characterised as the solution of a well-posed martingale problem on
$D([0,\infty), \mathcal{P}(E)^N)$ with the product topology on $\mathcal{P}(E)^N$. To this end, we have to consider an
algebra $\mathcal{F} \subset C_b(\mathcal{P}(E)^N, \mathbb{R})$ of test functions, and a linear operator $L^{(N)}$ on $C_b(\mathcal{P}(E)^N, \mathbb{R})$
Figure 2: Possible migration paths between $N = 4$ colonies with $M = 3$ individuals of two types in
the mean-field version.



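with domain $\mathcal{F}$, playing the role of the generator in the martingale problem. Here, we let $\mathcal{F}$
be the algebra of functions $F$ of the form

(1.9) $F(x) = \int_{E^n} \Big( \bigotimes_{m=1}^{n} x_{i_m}(du^m) \Big) \varphi(u^1, \ldots, u^n)$, $x = (x_0, \ldots, x_{N-1}) \in \mathcal{P}(E)^N$,

$n \in \mathbb{N}$, $\varphi \in C_b(E^n, \mathbb{R})$, $i_1, \ldots, i_n \in \{0, \ldots, N-1\}$.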
The generator

(1.10) $L^{(N)} \colon \mathcal{F} \to C_b(\mathcal{P}(E)^N, \mathbb{R})$

has two parts,

(1.11) $L^{(N)} = L^{(N)}_{\mathrm{mig}} + L^{(N)}_{\mathrm{res}}$.

The migration operator is given by

(1.12) $(L^{(N)}_{\mathrm{mig}} F)(x) = \frac{c}{N} \sum_{i,j=0}^{N-1} \int_E (x_j - x_i)(da)\, \frac{\partial F(x)}{\partial x_i}[\delta_a]$,

where

(1.13) $\frac{\partial F(x)}{\partial x_i}[\delta_a] = \lim_{h \downarrow 0} \frac{1}{h} \big[ F(x_0, \ldots, x_{i-1}, x_i + h\delta_a, x_{i+1}, \ldots, x_{N-1}) - F(x) \big]$

is the Gâteaux-derivative of $F$ with respect to $x_i$ in the direction $\delta_a$ (this definition requires
that in (1.9) we extend $\mathcal{P}(E)$ to the set of finite signed measures on $E$). Note that the total
derivative in the direction $\nu \in \mathcal{P}(E)$ is the integral over $\nu$ of the expression in (1.13), since
$\mathcal{P}(E)$ is a Choquet simplex and $F$ is continuously differentiable.

The resampling operator is given by (cf. the description of the single-colony $C^\Lambda$-process in
Section 1.3.1)

(1.14) $(L^{(N)}_{\mathrm{res}} F)(x) = \sum_{i=0}^{N-1} \int_{[0,1]} \Lambda^*(dr) \int_E x_i(da)\, \big[ F(x_0, \ldots, x_{i-1}, (1-r)x_i + r\delta_a, x_{i+1}, \ldots, x_{N-1}) - F(x) \big]$.

Note that, by the law of large numbers, in the limit $M \to \infty$ the evolution in (1.4)-(1.6) results
in the transition $x \to (1-r)x + r\delta_a$ with type $a$ drawn from distribution $x$. This gives rise to
(1.14).
Proposition 1.1. [Multi-colony martingale problem]

Without assumption (1.3), for every $x \in \mathcal{P}(E)^N$, the martingale problem for $(L^{(N)}, \mathcal{F}, \delta_x)$ is
well-posed. The unique solution is a strong Markov process with the Feller property.

1.3.3 $C^\Lambda$-process with immigration-emigration: McKean-Vlasov limit



The $N \to \infty$ limit of the $N$-colony model defined in Section 1.3.2 can be described in terms of
an independent and identically distributed family of $\mathcal{P}(E)$-valued processes indexed by $\mathbb{N}$. Let
us describe the distribution of a single member of this family, which can be viewed as a spatial
variant of the model in Section 1.3.1 when we add immigration-emigration to/from a cemetery
state, with the immigration given by a source that is constant in time. Such processes are of
interest in their own right. They are referred to as McKean-Vlasov processes for $(c, d, \Lambda, \theta)$,
$c, d \in (0,\infty)$, $\Lambda \in \mathcal{M}_f([0,1])$, $\theta \in \mathcal{P}(E)$, or $C^\Lambda$-processes with immigration-emigration at rate $c$
with source $\theta$ and volatility constant $d$.

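Let $\mathcal{F} \subset C_b(\mathcal{P}(E), \mathbb{R})$ be the algebra of functions $F$ of the form

(1.15) $F(x) = \int_{E^n} x^{\otimes n}(du)\, \varphi(u)$, $x \in \mathcal{P}(E)$, $n \in \mathbb{N}$, $\varphi \in C_b(E^n, \mathbb{R})$.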



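For $c, d \in (0,\infty)$, $\Lambda \in \mathcal{M}_f([0,1])$ subject to (1.2) and $\theta \in \mathcal{P}(E)$, let $L_\theta^{c,d,\Lambda} \colon \mathcal{F} \to
C_b(\mathcal{P}(E), \mathbb{R})$ be the linear operator

(1.16) $L_\theta^{c,d,\Lambda} = L_\theta^c + L^d + L^\Lambda$

acting on $F \in \mathcal{F}$ as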



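(1.17)
$(L_\theta^c F)(x) = c \int_E (\theta - x)(da)\, \frac{\partial F(x)}{\partial x}[\delta_a]$,
$(L^d F)(x) = d \int_E \int_E Q_x(du, dv)\, \frac{\partial^2 F(x)}{\partial x^2}[\delta_u, \delta_v]$,
$(L^\Lambda F)(x) = \int_{[0,1]} \Lambda^*(dr) \int_E x(da)\, \big[ F((1-r)x + r\delta_a) - F(x) \big]$,

where

(1.18) $Q_x(du, dv) = x(du)\, \delta_u(dv) - x(du)\, x(dv)$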

is the Fleming-Viot diffusion function. The three parts of $L_\theta^{c,d,\Lambda}$ correspond to: a drift towards
$\theta$ of strength $c$ (immigration-emigration), a Fleming-Viot diffusion with volatility $d$ (Moran
resampling), and a $C^\Lambda$-process with resampling measure $\Lambda$ (Cannings resampling). This model
arises as the $M \to \infty$ limit of an individual-based model with $M$ individuals at a single site
with immigration from a constant source with type distribution $\theta \in \mathcal{P}(E)$ and emigration to
a cemetery state, both at rate $c$, in addition to the $\Lambda$-resampling.



Proposition 1.2. [McKean-Vlasov martingale problem]

Without assumption (1.3), for every $x \in \mathcal{P}(E)$, the martingale problem for $(L_\theta^{c,d,\Lambda}, \mathcal{F}, \delta_x)$ is
well-posed. The unique solution is a strong Markov process with the Feller property.

Denote by

(1.19) $Z_\theta^{c,d,\Lambda} = (Z_\theta^{c,d,\Lambda}(t))_{t \ge 0}$, $Z_\theta^{c,d,\Lambda}(0) = \theta$,

the solution of the martingale problem in Proposition 1.2 for the special choice $x = \theta$. This is
called the McKean-Vlasov process with parameters $c, d, \Lambda$ and initial state $\theta$.
1.4 The hierarchical Cannings process



The model described in Section 1.3.2 has a finite geographical space, an interaction that is 
mean-field, and a resampling of individuals at the same site. In this section, we introduce two 
new features into the model: 



(1) We consider a countably infinite geographic space, namely, the hierarchical group $\Omega_N$ of
order $N$, with a migration mechanism that is block-wise exchangeable.

(2) We allow resampling between individuals not only at the same site but also in blocks
around a site, which we view as macro-colonies.



Both the migration rates and the resampling rates for macro-colonies decay as the distance
between the macro-colonies grows. Feature (1) is introduced in Sections 1.4.1-1.4.2, feature
(2) in Section 1.4.3. The hierarchical model is defined in Section 1.4.4.



1.4.1 Hierarchical group of order N 

The hierarchical group $\Omega_N$ of order $N$ is the set

(1.20) $\Omega_N = \Big\{ \eta = (\eta^l)_{l\in\mathbb{N}_0} \in \{0, 1, \ldots, N-1\}^{\mathbb{N}_0} \colon \sum_{l\in\mathbb{N}_0} \eta^l < \infty \Big\}$, $N \in \mathbb{N} \setminus \{1\}$,

endowed with the addition operation $+$ defined by $(\eta + \zeta)^l = \eta^l + \zeta^l$ (mod $N$), $l \in \mathbb{N}_0$. In
other words, $\Omega_N$ is the direct sum of the cyclical group of order $N$, a fact that is important for
the application of Fourier analysis. The group is equipped with the ultrametric distance
$d(\cdot, \cdot)$ defined by

(1.21) $d(\eta, \zeta) = d(0, \eta - \zeta) = \min\{k \in \mathbb{N}_0 \colon \eta^l = \zeta^l \text{ for all } l \ge k\}$, $\eta, \zeta \in \Omega_N$.

Let

(1.22) $B_k(\eta) = \{\zeta \in \Omega_N \colon d(\eta, \zeta) \le k\}$, $\eta \in \Omega_N$, $k \in \mathbb{N}_0$,

denote the $k$-block around $\eta$, which we think of as a macro-colony. The geometry of $\Omega_N$ is
explained in Fig. 3.
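
The definitions of the group operation, the ultrametric distance and the $k$-blocks can be made concrete with a minimal Python sketch. It assumes that an element of the hierarchical group is stored as a finite tuple of digits with all omitted trailing coordinates equal to zero; the function names `add`, `hier_distance` and `in_block` are hypothetical and only serve the illustration.

```python
from itertools import product, zip_longest


def add(eta, zeta, N):
    """Coordinate-wise addition modulo N (trailing coordinates are zero)."""
    return tuple((a + b) % N for a, b in zip_longest(eta, zeta, fillvalue=0))


def hier_distance(eta, zeta):
    """Smallest k such that eta and zeta agree in all coordinates l >= k."""
    d = 0
    for l, (a, b) in enumerate(zip_longest(eta, zeta, fillvalue=0)):
        if a != b:
            d = l + 1
    return d


def in_block(zeta, eta, k):
    """True if zeta lies in the k-block around eta."""
    return hier_distance(eta, zeta) <= k


# Count the elements of the 2-block around the origin for N = 3,
# restricting attention to elements with at most three non-zero coordinates.
N = 3
origin = (0, 0, 0)
block_size = sum(in_block(z, origin, 2) for z in product(range(N), repeat=3))
```

In this representation a $k$-block contains $N^k$ sites, and the distance satisfies the strong triangle inequality $d(\eta, \zeta) \le \max\{d(\eta, \xi), d(\xi, \zeta)\}$, which is what makes it an ultrametric.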




Figure 3: Close-ups of a 1-block, a 2-block and a 3-block in the hierarchical group of order $N = 3$. The
elements of the group are the leaves of the tree (□). The hierarchical distance between two elements
is the graph distance to the most recent common ancestor: $d(\xi, \eta) = 2$ for $\xi$ and $\eta$ in the picture.
We construct a process

(1.23) $X^{(\Omega_N)} = (X^{(\Omega_N)}(t))_{t \ge 0}$ with $X^{(\Omega_N)}(t) = \{X_\eta^{(\Omega_N)}(t)\}_{\eta\in\Omega_N} \in \mathcal{P}(E)^{\Omega_N}$,

by using the same evolution mechanism as for the multi-colony system in Section 1.3.2, except
that we replace the migration on $\{0, \ldots, N-1\}$ by a migration on $\Omega_N$, and the resampling
acting in each colony by a resampling in each of the macro-colonies. On $\mathcal{P}(E)^{\Omega_N}$, we again
choose the product of the weak topology on $\mathcal{P}(E)$ as the basic topology.

1.4.2 Block migration 

We introduce migration on $\Omega_N$ through a random walk kernel. For that purpose, we introduce
a sequence of migration rates

(1.24) $c = (c_k)_{k\in\mathbb{N}_0} \in (0,\infty)^{\mathbb{N}_0}$,

and we let the individuals migrate as follows:

• Each individual, for every $k \in \mathbb{N}$, chooses at rate $c_{k-1}/N^{k-1}$ the block of radius $k$ around
its present location and jumps to a location uniformly chosen at random in that block.



The transition rates of the random walk that is thus performed by each individual are

(1.25) $a^{(N)}(\eta, \zeta) = \sum_{k \ge d(\eta,\zeta)} \frac{c_{k-1}}{N^{2k-1}}$, $\eta, \zeta \in \Omega_N$, $\eta \ne \zeta$, $a^{(N)}(\eta, \eta) = 0$.



As shown in Dawson, Gorostiza and Wakolbinger [DGW05], this random walk is recurrent if
and only if $\sum_{k\in\mathbb{N}_0} (1/c_k) = \infty$. For the special case where $c_k = c^k$, it is strongly recurrent for
$c < 1$, critically recurrent for $c = 1$, and transient for $c > 1$.^3
Throughout the paper, we assume that^4



(1.26) $\limsup_{k\to\infty} \frac{1}{k} \log c_k < \infty$.

This guarantees that the total migration rate per individual is bounded.
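
The transition rates in (1.25) and the total migration rate per individual can be checked numerically with a short Python sketch. It rests on two assumptions that are not in the text: the sequence of migration coefficients is truncated after finitely many levels (`c[k]` stands for $c_k$), and the helper names `migration_rate` and `total_jump_rate` are hypothetical.

```python
import math


def migration_rate(d, c, N):
    """Rate a(eta, zeta) in (1.25) for two sites at hierarchical distance d >= 1.

    Truncated to the levels k = d, ..., len(c); c[k - 1] stands for c_{k-1}.
    """
    return sum(c[k - 1] / N ** (2 * k - 1) for k in range(d, len(c) + 1))


def total_jump_rate(c, N):
    """Total rate at which an individual picks some k-block, k = 1, ..., len(c)."""
    return sum(c[k - 1] / N ** (k - 1) for k in range(1, len(c) + 1))


c = [1.0, 1.0, 1.0]  # c_0, c_1, c_2 (illustrative values)
N = 2

# There are N**d - N**(d-1) sites at distance exactly d from a given site.
rate_to_other_sites = sum(
    (N ** d - N ** (d - 1)) * migration_rate(d, c, N) for d in range(1, len(c) + 1)
)
# A uniform jump inside a k-block lands on the current site with probability N**(-k).
rate_of_staying_put = sum(c[k - 1] / N ** (2 * k - 1) for k in range(1, len(c) + 1))
consistent = math.isclose(
    rate_to_other_sites, total_jump_rate(c, N) - rate_of_staying_put
)
```

The check confirms that summing (1.25) over all target sites reproduces the block-choice rates $c_{k-1}/N^{k-1}$, up to the jumps that return the individual to its own site. For $c_k = c^k$ the total rate is a geometric series in $c/N$, which illustrates why a growth condition such as (1.26) keeps it bounded once $N$ is large.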
1.4.3 Block reshuffling-resampling 



As we saw in Section 1.3, the idea of the Cannings model is to allow reproduction with an
offspring that is of a size comparable to the whole population. Since we have introduced a
spatial structure, we now allow, on all hierarchical levels $k$ simultaneously, a reproduction
event where each individual treats the $k$-block around its present location as a macro-colony
and uses it for its resampling. More precisely, we choose a sequence of finite non-negative
resampling measures

(1.27) $\Lambda = (\Lambda_k)_{k\in\mathbb{N}_0} \in \mathcal{M}_f([0,1])^{\mathbb{N}_0}$,

3 Loosely speaking, the behaviour is like that of simple random walk on $\mathbb{Z}^d$ with $d < 2$, $d = 2$ and $d > 2$,
respectively. More precisely, with the help of potential theory it is possible to associate with the random walk
a dimension as a function of $c$ and $N$ that for $N \to \infty$ converges to 2. This shows that in the limit as $N \to \infty$,
the potential theory of the hierarchical random walk given by (1.25) with $c = 1$ is similar to that of simple
random walk on $\mathbb{Z}^2$.

4 In Section 1.6 we will analyse the case $N < \infty$, where (1.26) must be replaced by $\limsup_{k\to\infty} \frac{1}{k} \log c_k < \log N$.
each subject to (1.2). Assume in addition that

(1.28) $\int_{[0,1]} \Lambda_k^*(dr) < \infty$, $k \in \mathbb{N}$,

and that $\Lambda_0$ satisfies (1.3). Set

(1.29) $\lambda_k = \Lambda_k^*([0,1])$, $k \in \mathbb{N}$.

We let individuals reshuffle-resample by carrying out the following two steps at once (the
formal definition requires the use of a suitable Poisson point process, cf. (2.26), (1.5) and
(1.6)):

• For every $\eta \in \Omega_N$ and $k \in \mathbb{N}_0$, choose the block $B_k(\eta)$ at rate $1/N^{2k}$.

• Each individual in $B_k(\eta)$ is first moved to a uniformly chosen random location in $B_k(\eta)$,
i.e., a reshuffling takes place (see Fig. 4). After that, $r$ is drawn according to the intensity
measure $\Lambda_k^*$ (recall (1.5)), and with probability $r$ each of the individuals in $B_k(\eta)$ is
replaced by an individual of type $a$, with $a$ drawn according to the type distribution in
$B_k(\eta)$, i.e.,

(1.30) $y_{\eta,k} = N^{-k} \sum_{\zeta \in B_k(\eta)} x_\zeta$.

Note that the reshuffling-resampling affects all the individuals in a macro-colony
simultaneously and in the same manner. The reshuffling-resampling occurs at all levels $k \in \mathbb{N}_0$, at a rate
that is fastest in single colonies and gets slower as the level $k$ of the macro-colony increases.



Figure 4: Random reshuffling in a 1-block on the hierarchical lattice of order $N = 3$ with $M = 3$
individuals of two types per colony (left: before reshuffling, right: after reshuffling).

Throughout the paper, we assume that $\lambda = (\lambda_k)_{k \in \mathbb{N}_0}$ satisfies

(1.31) $\limsup_{k\to\infty} \frac{1}{k} \log \lambda_k < \infty.$

Note that each of the $N^k$ colonies in a $k$-block can trigger reshuffling-resampling in that block, and for each colony the block is chosen at rate $N^{-2k}$. Therefore (1.31) guarantees that the total resampling rate per individual is bounded.
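The boundedness claim can be illustrated numerically. The following Python sketch rests on an assumption made here for illustration: level $k$ contributes $\lambda_k N^{k} N^{-2k} = \lambda_k N^{-k}$ to the resampling rate seen by one individual. The function name and the example sequence $\lambda_k = 2^k$ are hypothetical and not taken from the paper.

```python
def total_resampling_rate(lam, N, levels):
    """Partial sum over k < levels of lam(k) * N**k * N**(-2k) = lam(k) / N**k.

    Each of the N**k colonies of a k-block triggers the block at rate N**(-2k),
    so (under the reading stated above) level k contributes lam(k) * N**(-k).
    """
    return sum(lam(k) / N**k for k in range(levels))


# Hypothetical example: exponentially growing rates lam_k = 2**k with N = 3.
rate = total_resampling_rate(lambda k: 2.0**k, N=3, levels=200)
print(rate)  # geometric series with ratio 2/3, so the value is close to 3
```

With exponentially growing $\lambda_k$ the sum stays finite as soon as $N$ exceeds the growth factor, which is the mechanism behind condition (1.31).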



5 Because the reshuffling is done first, the resampling always acts on a uniformly distributed state ("panmictic resampling").

6 In Section 1.6 we will analyse the case $N < \infty$, where (1.31) must be replaced by $\limsup_{k\to\infty} \frac{1}{k} \log \lambda_k < N$.
We note that in the continuum mass limit the reshuffling-resampling operation takes the following form when it acts on the states in the colonies:

(1.32) $x_\zeta$ is replaced by $(1-r)\, y_{\eta,k} + r\, \delta_a$ for all $\zeta \in B_k(\eta)$

with $a \in E$ drawn from $y_{\eta,k}$. Note that in the mean-field case and in the single-colony case, $a \in E$ is drawn from $x_\zeta$ (cf. (1.14)).

1.4.4 Hierarchical Cannings process 

We are now ready to formally define our system of hierarchically interacting $C^\Lambda$-processes in terms of a martingale problem. This is the continuum-mass limit ($M \to \infty$) of the individual-based model that we described in Sections 1.4.1-1.4.3. Recall that so far we have considered block migration and block reshuffling-resampling on the hierarchical group of fixed order $N$, starting with $M$ individuals at each site.

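We equip the set $\mathcal{P}(E)^{\Omega_N}$ with the product topology to get a state space that is Polish. Let $\mathcal{F} \subset C_b(\mathcal{P}(E)^{\Omega_N}, \mathbb{R})$ be the algebra of functions of the form

(1.33) $F(x) = \int_{E^n} \Big( \bigotimes_{m=1}^{n} x_{\eta_m}(du^m) \Big)\, \varphi(u^1, \ldots, u^n), \quad x = (x_\eta)_{\eta \in \Omega_N} \in \mathcal{P}(E)^{\Omega_N},$
$n \in \mathbb{N}, \quad \varphi \in C_b(E^n, \mathbb{R}), \quad \eta_1, \ldots, \eta_n \in \Omega_N.$

The linear operator for the martingale problem

(1.34) $L^{(\Omega_N)} \colon \mathcal{F} \to C_b(\mathcal{P}(E)^{\Omega_N}, \mathbb{R})$

again has two parts,

(1.35) $L^{(\Omega_N)} = L^{(\Omega_N)}_{\mathrm{mig}} + L^{(\Omega_N)}_{\mathrm{res}}.$

The migration operator is given by

(1.36) $(L^{(\Omega_N)}_{\mathrm{mig}} F)(x) = \sum_{\eta, \zeta \in \Omega_N} a^{(N)}(\eta, \zeta) \int_E (x_\zeta - x_\eta)(da)\, \frac{\partial F(x)}{\partial x_\eta}[\delta_a]$

and the reshuffling-resampling operator by

(1.37) $(L^{(\Omega_N)}_{\mathrm{res}} F)(x) = \sum_{\eta \in \Omega_N} \sum_{k \in \mathbb{N}_0} N^{-2k} \int_{[0,1]} \Lambda_k^*(dr) \int_E y_{\eta,k}(da)\, \big[ F\big(\Phi_{r,a,B_k(\eta)}(x)\big) - F(x) \big],$

where $\Phi_{r,a,B_k(\eta)} \colon \mathcal{P}(E)^{\Omega_N} \to \mathcal{P}(E)^{\Omega_N}$ is the reshuffling-resampling map acting as

(1.38) $\big[\Phi_{r,a,B_k(\eta)}(x)\big]_\zeta = \begin{cases} (1-r)\, y_{\eta,k} + r\, \delta_a, & \zeta \in B_k(\eta), \\ x_\zeta, & \zeta \notin B_k(\eta), \end{cases} \qquad r \in [0,1],\ a \in E,\ k \in \mathbb{N}_0,\ \eta \in \Omega_N.$

Note that the right-hand side of (1.37) is well defined due to (1.28).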
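To make the action of the reshuffling-resampling map concrete, here is a minimal Python sketch on a finite truncation of the hierarchical group with a finite type space. The encoding of sites as integers in base $N$ (lowest level first), the function names, and the choice to pass $r$ and $a$ as arguments rather than sampling them are assumptions made for illustration; in the model itself $r$ is governed by $\Lambda_k^*$ and $a$ is drawn from the block average $y_{\eta,k}$.

```python
def block(eta, k, N):
    """Sites (integers in base N, lowest level first) that agree with eta
    in all coordinates of index >= k, i.e. the k-block around eta."""
    base = (eta // N**k) * N**k
    return range(base, base + N**k)


def reshuffle_resample(x, eta, k, r, a, N):
    """Set every site in the k-block of eta to (1 - r) * y + r * delta_a,
    where y is the block average; all other sites are left unchanged.

    x is a list of type-frequency vectors (lists of floats summing to 1).
    """
    sites = block(eta, k, N)
    n_types = len(x[0])
    # Block average of the type frequencies, as in (1.30).
    y = [sum(x[z][t] for z in sites) / len(sites) for t in range(n_types)]
    new_state = [(1 - r) * y[t] + (r if t == a else 0.0) for t in range(n_types)]
    out = [list(v) for v in x]  # copy, so the input state is not mutated
    for z in sites:
        out[z] = list(new_state)
    return out


# Hypothetical example: N = 3, two levels (9 colonies), two types.
x = [[1.0, 0.0]] * 3 + [[0.0, 1.0]] * 6
out = reshuffle_resample(x, eta=1, k=1, r=0.5, a=1, N=3)
```

In this example the 1-block around site 1 consists of sites 0, 1, 2, whose average is pure type 0, so after the update each of these three colonies holds half of each type, while the remaining six colonies are untouched.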



7 Reshuffling is a parallel update affecting all individuals in a macro-colony. Therefore it cannot be seen as 
a migration of individuals equipped with independent clocks. 
Proposition 1.3. [Hierarchical martingale problem]

Without assumption (1.3), for every $\Theta \in \mathcal{P}(E)^{\Omega_N}$, the martingale problem for $(L^{(\Omega_N)}, \mathcal{F}, \delta_\Theta)$ is well-posed. The unique solution is a strong Markov process with the Feller property.

The Markov process arising as the solution of the above martingale problem is denoted by $X^{(\Omega_N)} = (X^{(\Omega_N)}(t))_{t \geq 0}$, and is referred to as the $C_N^{c,\Lambda}$-process on $\Omega_N$.

Remark: For the analysis of the $C_N^{c,\Lambda}$-process, the following auxiliary models will be important later on. Given $K \in \mathbb{N}_0$, consider the finite geographical space

(1.39) $G_{N,K} = \{0, \ldots, N-1\}^K,$

which is a truncation of the hierarchical group $\Omega_N$ after $K$ levels. Equip $G_{N,K}$ with coordinate-wise addition modulo $N$, which turns it into a finite Abelian group. By restricting the migration and the resampling to $G_{N,K}$ (i.e., by setting $c_k = 0$ and $\Lambda_k = 0$ for $k \geq K$), we obtain a Markov process with geographic space $G_{N,K}$ that can be characterised by a martingale problem as well. In the limit as $K \to \infty$, this Markov process can be used to approximate the $C_N^{c,\Lambda}$-process.



1.5 Main results for $N \to \infty$

Our first set of main results concern a multiscale analysis of $X^{(\Omega_N)}$ in the limit as $N \to \infty$. To that end, we introduce renormalised systems with the proper space-time scaling. For each $k \in \mathbb{N}_0$, we look at the $k$-block averages defined by

(1.40) $Y^{(\Omega_N)}_{\eta,k}(t) = \frac{1}{N^k} \sum_{\zeta \in B_k(\eta)} X^{(\Omega_N)}_\zeta(t), \quad \eta \in \Omega_N,$

which constitute a renormalisation of space where the component $\eta$ is replaced by the average in $B_k(\eta)$. The corresponding renormalisation of time is to replace $t$ by $tN^k$, i.e., $t$ is the associated macroscopic time variable. For each $k \in \mathbb{N}_0$ and $\eta \in \Omega_N$, we can thus introduce a renormalised interacting system

(1.41) $\Big( \big( Y^{(\Omega_N)}_{\eta,k}(tN^k) \big)_{\eta \in \Omega_N} \Big)_{t \geq 0},$

which is constant in $B_k(\eta)$ and can be viewed as an interacting system indexed by the set $\Omega_N^{(k)}$ that is obtained from $\Omega_N$ by dropping the first $k$ entries of $\eta \in \Omega_N$. This provides us with a sequence of renormalised interacting systems, which for fixed $N$ are however not Markov.

Our main results are stated in Sections 1.5.1-1.5.5. In Section 1.5.1 we state the scaling behaviour of the renormalised interacting system in (1.41) as $N \to \infty$ for fixed $k \in \mathbb{N}_0$. In Section 1.5.2, we compare the result with the hierarchical Fleming-Viot process. In Sections 1.5.3-1.5.4 we identify the different regimes for $k \to \infty$. In Section 1.5.5 we look at the interaction chain that captures the scaling behaviour on all scales simultaneously.



1.5.1 The hierarchical mean-field limit 

Our first main theorem identifies the scaling behaviour of $X^{(\Omega_N)}$ as $N \to \infty$ (the so-called hierarchical mean-field limit) for every fixed block scale $k \in \mathbb{N}_0$. We assume that, for each $N$, the law of $X^{(\Omega_N)}(0)$ is the restriction to $\Omega_N$ of a random field $X$ indexed by $\Omega_\infty = \bigoplus_{\mathbb{N}} \mathbb{N}$ that is taken to be i.i.d. with a single-site mean $\theta$ for some $\theta \in \mathcal{P}(E)$.
Recall (1.29). Let $d = (d_k)_{k \in \mathbb{N}_0}$ be the sequence of volatility constants defined recursively as

(1.42) $d_0 = 0, \qquad d_{k+1} = \frac{c_k \big( \tfrac12 \lambda_k + d_k \big)}{c_k + \big( \tfrac12 \lambda_k + d_k \big)}, \quad k \in \mathbb{N}_0.$

Let $\mathcal{L}$ denote law, let $\Longrightarrow$ denote weak convergence on path space, and recall (1.19).

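Theorem 1.4. [Hierarchical mean-field limit and renormalisation]

For every $k \in \mathbb{N}_0$, uniformly in $\eta \in \Omega_\infty$,

(1.43) $\mathcal{L}\Big[ \big( Y^{(\Omega_N)}_{\eta,k}(tN^k) \big)_{t \geq 0} \Big] \underset{N \to \infty}{\Longrightarrow} \mathcal{L}\Big[ \big( Z^{c_k, d_k, \Lambda_k}_\theta(t) \big)_{t \geq 0} \Big].$

The limiting process in (1.43) is a McKean-Vlasov process with drift constant $c = c_k$ and resampling measure $d_k \delta_0 + \Lambda_k$. This shows that the class of Cannings models with block resampling is preserved under the renormalisation.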

We will see in Section 1.5.5 that the large-scale behaviour of $X^{(\Omega_N)}$ is determined by the sequence $m = (m_k)_{k \in \mathbb{N}_0}$ with

(1.44) $m_k = \frac{\mu_k + d_k}{c_k}, \quad$ where $\mu_k = \tfrac12 \lambda_k$.

We will argue that the dichotomy

(1.45) $\sum_{k \in \mathbb{N}_0} m_k = \infty \quad$ vs. $\quad \sum_{k \in \mathbb{N}_0} m_k < \infty$

represents qualitatively different situations for the interacting system $X^{(\Omega_N)}$ corresponding to, respectively,

• clustering (= formation of large mono-type regions),

• local coexistence (= convergence to multi-type equilibria).

(See Section 1.5.5 for more precise definitions.) In the clustering regime the scaling behaviour of $d_k$ is independent of $d_0$, while in the local coexistence regime it depends on $d_0$ (see Section 1.5.5).

For the classical case of hierarchically interacting Fleming-Viot diffusions (i.e., in the absence of block reshuffling-resampling), the dichotomy in (1.45) reduces to

(1.46) $\sum_{k \in \mathbb{N}_0} (1/c_k) = \infty \quad$ vs. $\quad \sum_{k \in \mathbb{N}_0} (1/c_k) < \infty,$

corresponding to the random walk with migration coefficients $c = (c_k)_{k \in \mathbb{N}_0}$ being recurrent, respectively, transient. Moreover, it is known that in the clustering regime $\lim_{k\to\infty} \sigma_k d_k = 1$ with $\sigma_k = \sum_{l=0}^{k-1} (1/c_l)$ for all $d_0$.
1.5.2 Comparison with the dichotomy for the hierarchical Fleming-Viot process

Our second main theorem provides a comparison of the clustering vs. coexistence dichotomy with the one for the hierarchical Fleming-Viot process. Let

(1.47) $d^* = (d_k^*)_{k \in \mathbb{N}_0}$

be the sequence of volatility constants when $\mu_0 > 0$ and $\mu_k = 0$ for all $k \in \mathbb{N}$, i.e., there is resampling in single colonies but not in macro-colonies. By (1.42), this sequence has initial value $d_0^* = 0$, with

(1.48) $d_1^* = d_1 = \frac{c_0 \mu_0}{c_0 + \mu_0}, \qquad \frac{1}{d_{k+1}^*} = \frac{1}{c_k} + \frac{1}{d_k^*}, \quad k \in \mathbb{N},$

whose solution is

(1.49) $d_k^* = \frac{\mu_0}{1 + \mu_0 \sigma_k}, \quad k \in \mathbb{N}, \quad$ with $\sigma_k = \sum_{l=0}^{k-1} \frac{1}{c_l}.$

Theorem 1.5. [Comparison with hierarchical Fleming-Viot]

The following hold for $(d_k)_{k \in \mathbb{N}_0}$:

(a) The maps $c \mapsto d$ and $\mu \mapsto d$ are component-wise non-decreasing.

(b) $d_k \geq d_k^*$ for all $k \in \mathbb{N}$.

(c) $\sum_{k \in \mathbb{N}_0} m_k = \infty$ if and only if $\sum_{k \in \mathbb{N}_0} (1/c_k) \sum_{l=0}^{k} \mu_l = \infty$.

(d) If $\lim_{k\to\infty} \sigma_k = \infty$ and $\sum_{k \in \mathbb{N}} \sigma_k \mu_k < \infty$, then $\lim_{k\to\infty} \sigma_k d_k = 1$.

In words, (a) and (b) say that both migration and reshuffling-resampling increase volatility (recall (1.44)-(1.45)), (c) says that the dichotomy in (1.46) due to migration is affected by reshuffling-resampling only when the latter is strong enough, i.e., when $\sum_{k \in \mathbb{N}} \mu_k = \infty$, while (d) says that the scaling behaviour of $d_k$ in the clustering regime is unaffected by the reshuffling-resampling when the latter is weak enough, i.e., when $\sum_{k \in \mathbb{N}} \sigma_k \mu_k < \infty$. Note that the criterion in (c) shows that migration tends to inhibit clustering while reshuffling-resampling tends to enhance clustering.

We will see in Section 11.1 that in the local coexistence regime $d_k \sim \sum_{l=0}^{k-1} \mu_l$ as $k \to \infty$ when this sum diverges and $d_k \to \sum_{l \in \mathbb{N}_0} \mu_l / \prod_{j=l}^{\infty} (1 + m_j) \in (0, \infty)$ when it converges. Thus, in the local coexistence regime the scaling of $d_k$ is determined by the resampling-reshuffling.

In the regime where the system clusters, i.e., $\sum_{k \in \mathbb{N}_0} m_k = \infty$, it is important to be able to say more about the behaviour of $m_k$ as $k \to \infty$ in order to understand the patterns of cluster formation. For this the key is the behaviour of $d_k$ as $k \to \infty$, which we study in Sections 1.5.3-1.5.4 for polynomial, respectively, exponential growth of the coefficients $c_k$ and $\lambda_k$.
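The recursion for the volatility constants is easy to iterate numerically. The Python sketch below assumes the reading $d_0 = 0$, $d_{k+1} = c_k(\mu_k + d_k)/(c_k + \mu_k + d_k)$ of (1.42) and $m_k = (\mu_k + d_k)/c_k$ of (1.44), with $\mu_k$ half the total mass $\lambda_k$. The function names and the example coefficients $c_k = k + 1$ are hypothetical choices. The example uses resampling in single colonies only, so the iterates can be compared with the closed form (1.49).

```python
def volatility(c, mu, n):
    """Return [d_0, ..., d_n] with d_0 = 0 and
    d_{k+1} = c_k (mu_k + d_k) / (c_k + mu_k + d_k)."""
    d = [0.0]
    for k in range(n):
        s = mu(k) + d[k]
        d.append(c(k) * s / (c(k) + s))
    return d


def m_sequence(c, mu, n):
    """Return [m_0, ..., m_{n-1}] with m_k = (mu_k + d_k) / c_k."""
    d = volatility(c, mu, n)
    return [(mu(k) + d[k]) / c(k) for k in range(n)]


# Hypothetical example: mu_0 = 1, mu_k = 0 for k >= 1, and c_k = k + 1.
c = lambda k: k + 1.0
mu = lambda k: 1.0 if k == 0 else 0.0
d = volatility(c, mu, 10)

# Closed form d*_k = mu_0 / (1 + mu_0 * sigma_k), sigma_k = sum_{l<k} 1/c_l.
sigma = [sum(1.0 / c(l) for l in range(k)) for k in range(11)]
closed_form = [mu(0) / (1.0 + mu(0) * sigma[k]) for k in range(1, 11)]
```

Comparing `d[1:]` with `closed_form` entry by entry gives agreement up to floating-point rounding, which is a quick sanity check on the recursion before trying other choices of $c$ and $\mu$.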

1.5.3 Scaling in the clustering regime: polynomial coefficients 

Our third main theorem identifies the scaling behaviour of $d_k$ as $k \to \infty$ in four different regimes, defined by the relative size of the migration coefficient $c_k$ versus the block resampling coefficient $\lambda_k$. The necessary regularity conditions are stated in (1.55)-(1.58) below. Define

(1.50) $\lim_{k\to\infty} \frac{\mu_k}{c_k} = K \in [0, \infty] \quad$ and, if $K = 0$, also $\quad \lim_{k\to\infty} \frac{k^2 \mu_k}{c_k} = L \in [0, \infty].$
Theorem 1.6. [Scaling of the volatility in the clustering regime: polynomial coefficients]

Assume that the regularity conditions (1.55)-(1.58) hold.

(a) If $K = \infty$, then

(1.51) $\lim_{k\to\infty} \frac{d_k}{c_k} = 1.$

(b) If $K \in (0, \infty)$, then

(1.52) $\lim_{k\to\infty} \frac{d_k}{c_k} = M \quad$ with $\quad M = \tfrac12 K \big[ -1 + \sqrt{1 + (4/K)} \big] \in (0, 1).$

(c) If $K = 0$ and $L = \infty$, then

(1.53) $\lim_{k\to\infty} \frac{d_k}{\sqrt{c_k \mu_k}} = 1.$

(d) If $K = 0$, $L < \infty$ and $a \in (-\infty, 1)$, then

(1.54) $\lim_{k\to\infty} \sigma_k d_k = N \quad$ with $\quad N = \tfrac12 \big[ 1 + \sqrt{1 + 4L/(1-a)^2} \big] \in [1, \infty).$

The meaning of these four regimes for the evolution of the population will be explained in Corollary 1.10.
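Case (b) can be probed numerically. The Python sketch below iterates the volatility recursion, read here as $d_{k+1} = c_k(\mu_k + d_k)/(c_k + \mu_k + d_k)$ with $d_0 = 0$, for the hypothetical choice $c_k = k + 1$ and $\mu_k = K c_k$, and compares $d_n/c_n$ with the constant $M = \tfrac12 K[-1 + \sqrt{1 + 4/K}]$, which is the positive root of $x^2 + Kx - K = 0$. The function name and the parameter values are illustrative assumptions.

```python
import math


def ratio_d_over_c(K, n):
    """Iterate the recursion with c_k = k + 1 and mu_k = K * c_k,
    starting from d_0 = 0, and return d_n / c_n."""
    d = 0.0
    for k in range(n):
        c_k = k + 1.0
        s = K * c_k + d            # mu_k + d_k
        d = c_k * s / (c_k + s)    # d_{k+1}
    return d / (n + 1.0)           # c_n = n + 1


K = 1.0
M = 0.5 * K * (-1.0 + math.sqrt(1.0 + 4.0 / K))
approx = ratio_d_over_c(K, 5000)
print(approx, M)
```

For polynomial coefficients the ratio $c_k/c_{k+1}$ tends to 1 only at rate $1/k$, so the agreement with $M$ improves slowly in $n$; the sketch is a plausibility check, not a proof.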



Regularity conditions. In Theorem 1.6 we need to impose some mild regularity conditions on $c$ and $\mu$, which we collect in (1.55)-(1.58) below. We require that both $c_k$ and $\mu_k$ are regularly varying at infinity, i.e.,

(1.55) $c_k \sim L_c(k)\, k^a, \quad a \in \mathbb{R}, \qquad \mu_k \sim L_\mu(k)\, k^b, \quad b \in \mathbb{R}, \qquad k \to \infty,$

with $L_c, L_\mu$ slowly varying at infinity (Bingham, Goldie and Teugels [BGT87, Section 1.9]). The numbers $a, b$ are referred to as the indices of $c$ and $\mu$.

To handle the boundary cases, where $c_k$, $\mu_k$, $\mu_k/c_k$ and/or $k^2 \mu_k/c_k$ are slowly varying, we additionally require that for specific choices of the indices the following functions are asymptotically monotone:

(1.56)
$a = 0$: $\quad k \mapsto \Delta L_c(k)/L_c(k), \quad k \mapsto k \Delta L_c(k)/L_c(k),$
$b = 0$: $\quad k \mapsto \Delta L_\mu(k)/L_\mu(k), \quad k \mapsto k \Delta L_\mu(k)/L_\mu(k),$

and the following functions are bounded:

(1.57)
$a = 0$: $\quad k \mapsto k \Delta L_c(k)/L_c(k),$
$b = 0$: $\quad k \mapsto k \Delta L_\mu(k)/L_\mu(k),$

where $\Delta L(k) = L(k+1) - L(k)$. To ensure the existence of the limits in (1.50), we also need the following functions to be asymptotically monotone:

(1.58)
$a = b$: $\quad k \mapsto L_\mu(k)/L_c(k),$
$a = b - 2$: $\quad k \mapsto k^2 L_\mu(k)/L_c(k).$


8 Regular variation is typically defined with respect to a continuous instead of a discrete variable. However, 
every regularly varying sequence can be embedded into a regularly varying function. 
1.5.4 Scaling in the clustering regime: exponential coefficients

We briefly indicate how Theorem 1.6 extends when $c_k$ and $\mu_k$ satisfy

(1.59) $c_k = c^k \bar{c}_k, \quad \mu_k = \mu^k \bar{\mu}_k \quad$ with $c, \mu \in (0, \infty)$ and $(\bar{c}_k)$, $(\bar{\mu}_k)$ regularly varying at infinity, $\qquad \bar{K} = \lim_{k\to\infty} \frac{\bar{\mu}_k}{\bar{c}_k} \in [0, \infty],$

and the analogues of (1.56)-(1.58) apply to the regularly varying parts.

Theorem 1.7. [Scaling of the volatility in the clustering regime: exponential coefficients]

Assume that (1.59) holds. Then:

(A) [like Case (a)] $c < \mu$ or $c = \mu$, $\bar{K} = \infty$: $\lim_{k\to\infty} d_k/c_k = 1/c$.

(B) [like Case (b)] $c = \mu$, $\bar{K} \in (0, \infty)$: $\lim_{k\to\infty} d_k/c_k = \bar{M}$ with

(1.60) $\bar{M} = \frac{1}{2c} \Big[ -\big(c(\bar{K}+1) - 1\big) + \sqrt{\big(c(\bar{K}+1) - 1\big)^2 + 4c\bar{K}} \Big].$

(C) The remainder $c > \mu$ or $c = \mu$, $\bar{K} = 0$ splits into three cases:

(C1) [like Case (d)] $1 > c > \mu$ or $1 = c > \mu$, $\lim_{k\to\infty} \sigma_k = \infty$: $\lim_{k\to\infty} \sigma_k d_k = 1$.
(C2) [like Case (b)] $c = \mu < 1$, $\bar{K} = 0$: $\lim_{k\to\infty} d_k/c_k = (1-c)/c$.
(C3) [like Case (c)] $c = \mu > 1$, $\bar{K} = 0$: $\lim_{k\to\infty} d_k/\mu_k = 1/(\mu - 1)$.

The choices $1 = c > \mu$, $\lim_{k\to\infty} \sigma_k < \infty$ and $c > 1$, $c > \mu$ correspond to local coexistence (and so does $c = \mu > 1$, $\bar{K} = 0$, $\sum_{k \in \mathbb{N}_0} \mu_k/c_k < \infty$).

1.5.5 Multi-scale analysis: the interaction chain 

Multi-scale behaviour. Our fourth main theorem looks at the implications of the scaling behaviour of $d_k$ as $k \to \infty$ described in Theorems 1.5-1.6, for which we must extend Theorem 1.4 to include multi-scale renormalisation. This is done by considering two indices $(j, k)$ and introducing an appropriate multi-scale limiting process, called the interaction chain

(1.61) $M^{(j)} = \big( M^{(j)}_k \big)_{k = -(j+1), \ldots, 0}, \quad j \in \mathbb{N}_0,$

which describes all the block averages of size $N^{|k|}$ indexed by $k = -(j+1), \ldots, 0$ simultaneously at time $N^j t$ with $j \in \mathbb{N}_0$ fixed. Formally, the interaction chain is defined as the time-inhomogeneous Markov chain with a prescribed initial state at time $-(j+1)$,

(1.62) $M^{(j)}_{-(j+1)} = \theta \in \mathcal{P}(E),$

and with transition kernel

(1.63) $K_k(x, \cdot) = \nu_x^{c_k, d_k, \Lambda_k}(\cdot), \quad x \in \mathcal{P}(E), \ k \in \mathbb{N}_0,$

for the transition from time $-(k+1)$ to time $-k$ (for $k = j, \ldots, 0$). Here, $\nu_x^{c,d,\Lambda}$ is the unique equilibrium of the McKean-Vlasov process $Z_x^{c,d,\Lambda}$ defined in Section 1.3.3 (see Section 4 for details).
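Theorem 1.8. [Multi-scale behaviour]

Let $(t_N)_{N \in \mathbb{N}}$ be such that

(1.64) $\lim_{N\to\infty} t_N = \infty \quad$ and $\quad \lim_{N\to\infty} t_N/N = 0.$

Then, for every $j \in \mathbb{N}_0$, uniformly in $\eta \in \Omega_\infty$ and $u_k \in (0, \infty)$, $k = 0, \ldots, j$,

(1.65) $\mathcal{L}\Big[ \big( Y^{(\Omega_N)}_{\eta,k}(N^j t_N + N^k u_k) \big)_{k = j, \ldots, 0} \Big] \underset{N \to \infty}{\Longrightarrow} \mathcal{L}\Big[ \big( M^{(j)}_{-k} \big)_{k = j, \ldots, 0} \Big], \qquad \mathcal{L}\Big[ Y^{(\Omega_N)}_{\eta, j+1}(N^j t_N) \Big] \underset{N \to \infty}{\Longrightarrow} \delta_\theta.$

Theorem 1.8 says that, as $N \to \infty$, the system is in a quasi-equilibrium $\nu_x^{c_k, d_k, \Lambda_k}$ on time scale $N^j t_N + N^k u$, with $u \in (0, \infty)$ the macroscopic time parameter on level $k$, when $x$ is the average on level $k+1$.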



The basic dichotomy. Our fifth main theorem lets the index in the multi-scale renormalisation scheme tend to infinity and identifies how the limit depends on the parameters $(c, \Lambda)$. Indeed, Theorem 1.8 in combination with Theorems 1.5-1.6 allows us to study the universality properties on large space-time scales when we first let $N \to \infty$ and then $j \to \infty$. The interaction chain exhibits a dichotomy, in the sense that

(1.66) $\mathcal{L}\big[ M^{(j)}_0 \big] \underset{j \to \infty}{\Longrightarrow} \nu_\theta,$

with $\nu_\theta$ either of the form of a random point measure, i.e.,

(1.67) $\nu_\theta = \mathcal{L}[\delta_U], \quad$ for some random $U \in E$ with $\mathcal{L}[U] = \theta$,

or $\nu_\theta$ spread out, i.e.,

(1.68) $\sup_{\psi \in B_1} E_{\nu_\theta}[\mathrm{Var}_x(\psi)] > 0,$

where $B_1 = C_b(E, \mathbb{R}) \cap \{\psi \colon |\psi| \leq 1\}$ and

(1.69) $E_{\nu_\theta}[\mathrm{Var}_x(\psi)] = \int_{\mathcal{P}(E)} \nu_\theta(dx)\, \mathrm{Var}_x(\psi)$

with

(1.70) $\mathrm{Var}_x(\psi) = \int_{E \times E} \big[ x(du)\delta_u(dv) - x(du)x(dv) \big]\, \psi(u)\psi(v).$

The first case is called the clustering regime, since it indicates the formation of large mono-type regions, while the second case is called the local coexistence regime, since it indicates the formation of multi-type local equilibria under which different types can live next to each other with a positive probability. In the local coexistence regime, a remarkable difference occurs with the hierarchical Fleming-Viot process: mono-type regions for $M^{(j)}_0$ as $j \to \infty$ have a probability in the open interval $(0, 1)$ rather than probability 0. The latter is referred to in [DGV95] by saying that the system is in the stable regime (which is stronger than local coexistence). In the present paper, we do not identify the conditions on $c$ and $\Lambda$ that correspond to the stable regime. The dichotomy can be conveniently rephrased as follows: There is either a trivial or a non-trivial entrance law for the interaction chain with initial state $\theta \in \mathcal{P}(E)$ at time $-\infty$.

9 For several previously investigated systems, the limits as $N \to \infty$ and $j \to \infty$ were shown to be interchangeable (Dawson, Greven and Vaillancourt [DGV95], Fleischmann and Greven [FG94]).



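We will show in Section 4.4 that

(1.71) $E_{\mathcal{L}[M^{(j)}_0]}[\mathrm{Var}_x(\psi)] = \Big[ \prod_{k=0}^{j} \frac{1}{1 + m_k} \Big] \mathrm{Var}_\theta(\psi), \quad j \in \mathbb{N}_0, \ \psi \in C_b(E, \mathbb{R}), \ \theta \in \mathcal{P}(E).$

This shows that the entrance law is trivial when $\sum_{k \in \mathbb{N}_0} m_k = \infty$ and non-trivial when $\sum_{k \in \mathbb{N}_0} m_k < \infty$.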
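The dichotomy can be seen directly in the prefactor of the variance. The Python sketch below evaluates $\prod_{k=0}^{j} 1/(1 + m_k)$, assuming the recursion $d_0 = 0$, $d_{k+1} = c_k(\mu_k + d_k)/(c_k + \mu_k + d_k)$ and $m_k = (\mu_k + d_k)/c_k$. The function name and the two coefficient choices (one recurrent, one transient, both without block resampling) are hypothetical examples.

```python
def variance_factor(c, mu, j):
    """Product over k = 0..j of 1 / (1 + m_k), with m_k = (mu_k + d_k) / c_k
    and d_k generated by the volatility recursion started at d_0 = 0."""
    d, factor = 0.0, 1.0
    for k in range(j + 1):
        s = mu(k) + d              # mu_k + d_k
        factor /= 1.0 + s / c(k)   # divide by 1 + m_k
        d = c(k) * s / (c(k) + s)  # d_{k+1}
    return factor


# Hypothetical example: mu_0 = 1 and no block resampling (mu_k = 0, k >= 1).
mu = lambda k: 1.0 if k == 0 else 0.0
recurrent = variance_factor(lambda k: 1.0, mu, 998)     # c_k = 1
transient = variance_factor(lambda k: 2.0**k, mu, 200)  # c_k = 2^k
print(recurrent, transient)
```

Without block resampling the product telescopes to $d_{j+1}/\mu_0 = 1/(1 + \mu_0 \sigma_{j+1})$. For $c_k = 1$ this equals $1/(j+2)$ and tends to 0 (clustering), while for $c_k = 2^k$ it tends to $1/3$ (local coexistence).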

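Theorem 1.9. [Dichotomy of the entrance law]

(a) The interaction chain converges to an entrance law:

(1.72) $\mathcal{L}\Big[ \big( M^{(j)}_k \big)_{k = -(j+1), \ldots, 0} \Big] \underset{j \to \infty}{\Longrightarrow} \mathcal{L}\Big[ \big( M^{(\infty)}_k \big)_{k = -\infty, \ldots, 0} \Big].$

(b) [Clustering] If $\sum_{k \in \mathbb{N}_0} m_k = \infty$, then $\mathcal{L}\big[ M^{(j)}_0 \big] \underset{j \to \infty}{\Longrightarrow} \mathcal{L}[\delta_U]$ with $\mathcal{L}[U] = \theta$.

(c) [Local coexistence] If $\sum_{k \in \mathbb{N}_0} m_k < \infty$, then $\sup_{\psi \in C_b(E, \mathbb{R})} E_{\mathcal{L}[M^{(\infty)}_0]}[\mathrm{Var}_x(\psi)] > 0$.

Theorem 1.9 in combination with Theorem 1.5(c) says that, like for Fleming-Viot diffusions, we have a clear-cut criterion for the two regimes in terms of the migration coefficients and the resampling coefficients.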



Scaling of the variance. Our sixth main theorem shows what the scaling of $d_k$ in Theorem 1.6 implies for the scaling of $m_k$ and hence of the variance in (1.71) (we will see in Section 11.3 that the conditions for Case (d) imply that $\lim_{k\to\infty} \mu_k \sigma_k = 0$ and $\lim_{k\to\infty} c_k \sigma_k = \infty$).

Corollary 1.10. [Scaling behaviour of $m_k$]

The following asymptotics of $m_k$ for $k \to \infty$ holds in the four cases of Theorem 1.6:

(1.73)
(a) $m_k \sim \frac{\mu_k}{c_k} \to \infty,$ $\qquad$ (b) $m_k \to K + M,$
(c) $m_k \sim \sqrt{\frac{\mu_k}{c_k}} \to 0,$ $\qquad$ (d) $m_k \sim \frac{N}{c_k \sigma_k} \to 0.$

All four cases fall in the clustering regime. For the variance in (1.71) they imply: (a) superexponential decay; (b) exponential decay; (c-d) subexponential decay.

Note that Case (d) also falls in the clustering regime because it assumes that $a \in (-\infty, 1)$, which implies that $\lim_{k\to\infty} \sigma_k = \infty$. Indeed, $1/(c_k \sigma_k) = (\sigma_{k+1} - \sigma_k)/\sigma_k$, and in Section 11.1 we will see that

(1.74) $\lim_{k\to\infty} \sigma_k = \infty \quad \Longleftrightarrow \quad \sum_{k \in \mathbb{N}} \frac{1}{c_k \sigma_k} = \infty.$

Combining Cases (a-d), we conclude the following:

• The regime of weak block resampling (for which the scaling behaviour of $d_k$ is the same as if there were no block resampling) coincides with the choice $K = 0$ and $L < \infty$.

• The regime of strong block resampling (for which the scaling behaviour of $d_k$ is different) coincides with $K = 0$ and $L = \infty$ or $K > 0$.

Note that $M \uparrow 1$ as $K \to \infty$, so that Case (b) connects up with Case (a). Further note that $M \sim \sqrt{K}$ as $K \downarrow 0$, so that Case (b) also connects up with Case (c). Finally, note that $\sqrt{c_k \mu_k} \sim \sqrt{L}\, c_k/k$ as $k \to \infty$ for Case (d) by (1.50), while $c_k \sigma_k \sim k/(1-a)$ as $k \to \infty$ when $a \in (-\infty, 1)$ by (1.55). Hence, Case (d) connects up with Case (c) as well.

10 Recall that an entrance law for a sequence of transition kernels $(K_k)_{k=-\infty}^{0}$ and an entrance state $\theta$ is any law of a Markov chain $(Y_k)_{k=-\infty}^{0}$ with these transition kernels such that $\lim_{k \to -\infty} Y_k = \theta$.



Cluster formation. In the clustering regime, it is of interest to study the size of the mono-type regions as a function of time, i.e., how do the clusters grow? To that end, we look at the interaction chain $M^{(j)}_{-k(j)}$ for $j \to \infty$ and level scaling $k = k(j)$ for some $k \colon \mathbb{N} \to \mathbb{N}$ with $\lim_{j\to\infty} k(j) = \infty$, suitably chosen such that we obtain a nontrivial limit law. For example, in Dawson and Greven [DG93b] such a result was proved in the case of interacting Fleming-Viot processes when $c$ is critically recurrent. Here, different types of limit laws and different types of scaling can occur, corresponding to different clustering regimes. Following Dawson, Greven and Vaillancourt [DGV95] and Dawson and Greven [DG96], it is natural to consider a whole family of scalings $k_\alpha \colon \mathbb{N} \to \mathbb{N}$, $\alpha \in [0, 1]$, and single out fast, diffusive and slow clustering regimes, which are defined as follows:

(i) Fast clustering: $\lim_{j\to\infty} k_\alpha(j)/j = 1$ for all $\alpha$.

(ii) Diffusive clustering: In this regime, $\lim_{j\to\infty} k_\alpha(j)/j = \kappa(\alpha)$ for all $\alpha$, where $\alpha \mapsto \kappa(\alpha)$ is continuous and non-increasing with $\kappa(0) = 1$ and $\kappa(1) = 0$.

(iii) Slow clustering: $\lim_{j\to\infty} k_\alpha(j)/j = 0$ for all $\alpha$. This regime borders with the regime of local coexistence.



Remark: Diffusive clustering similar to (ii) was previously found for the voter model on $\mathbb{Z}^2$ by Cox and Griffeath [CGr86], where the radii of the clusters of opinion "all 1" or "all 0" scale as $t^{\alpha/2}$ with $\alpha \in [0, 1)$, i.e., clusters occur on all scales $\alpha \in [0, 1)$. This is different from what happens on $\mathbb{Z}^d$, $d \geq 3$, where clusters occur only on scale $\alpha = 1$. For the model of hierarchically interacting Fleming-Viot diffusions with $c_k = 1$ (= critically recurrent migration), Fleischmann and Greven [FG94] showed that, for all $N \in \mathbb{N} \setminus \{1\}$ and all $\eta \in \Omega_N$,

(1.75) $\mathcal{L}\Big[ \big( Y^{(\Omega_N)}_{\eta, \lfloor (1-\alpha)t \rfloor}(N^t) \big)_{\alpha \in [0,1)} \Big] \underset{t \to \infty}{\Longrightarrow} \mathcal{L}\Big[ \Big( Y^{(\Omega_N)}\big( \log \tfrac{1}{1-\alpha} \big) \Big)_{\alpha \in [0,1)} \Big],$

where $Y^{(\Omega_N)}$ is a time-transformed Fleming-Viot diffusion on $\mathcal{P}(E)$. A similar behaviour occurs for other models, e.g. for branching models (Dawson and Greven [DG96]).

Our next two main theorems show which type of clustering occurs for the various scaling regimes of the coefficients $c$ and $\mu$ identified in Theorems 1.6-1.7. Polynomial coefficients allow for fast and diffusive clustering only. Exponential coefficients allow for fast, diffusive and slow clustering, with the latter only in a narrow regime.



Theorem 1.11. [Clustering regimes for polynomial coefficients] 

Recall the scaling regimes of Theorem 1.6.



Renormalisation of hierarchically interacting Cannings processes 






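(i) [Fast clustering] In cases (a-c), the system exhibits fast clustering.

(ii) [Diffusive clustering] In case (d), the system exhibits diffusive clustering, i.e.,

(1.76) $\mathcal{L}\Big[ \big( M^{(j)}_{-\lfloor (1-\alpha)j \rfloor} \big)_{\alpha \in [0,1)} \Big] \underset{j \to \infty}{\Longrightarrow} \mathcal{L}\Big[ \Big( Z^{0,1,0}_\theta\big( \log \tfrac{1}{(1-\alpha)^R} \big) \Big)_{\alpha \in [0,1)} \Big],$

where $R = N(1-a)$ with $N$ defined in (1.54) and $a$ the exponent in (1.55).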



Theorem 1.12. [Clustering regimes for exponential coefficients]

Recall the scaling regimes of Theorem 1.7.

(i) [Fast clustering] In cases (A, B, C1, C2), and case (C3) with $\lim_{k\to\infty} k\mu_k/c_k = \infty$, the system exhibits fast clustering.

(ii) [Diffusive clustering] In case (C3) with $\lim_{k\to\infty} k\mu_k/c_k = C$, the system exhibits diffusive clustering, i.e., (1.76) holds with $R = C/(\mu - 1)$.

(iii) [Slow clustering] In case (C3) with $k\mu_k/c_k \asymp 1/(\log k)^\gamma$, $\gamma \in (0, 1)$, the system exhibits slow clustering.

Note that (1.75) is a statement valid for all $N \in \mathbb{N} \setminus \{1\}$. In contrast, Theorems 1.11-1.12 are valid in the hierarchical mean-field limit $N \to \infty$ only. What can we say about the clustering vs. local coexistence dichotomy in our model for finite $N$?



1.6 Main results for finite N

In this section, we take a look at our system $X^{(\Omega_N)}$ for finite $N$, i.e., without taking the hierarchical mean-field limit. We ask whether this system also exhibits a dichotomy of clustering versus local coexistence, i.e., for fixed $N$ and $t \to \infty$, does $\mathcal{L}[X^{(\Omega_N)}(t)]$ converge to a mono-type state, where the type is distributed according to $\theta$, or to an equilibrium state, where different types live next to each other?

Let $P_t(\cdot, \cdot)$ denote the transition kernel of the random walk on $\Omega_N$ with migration coefficients

(1.77) $\big( c_k + \lambda_{k+1} N^{-(k+1)} \big)_{k \in \mathbb{N}_0}$

starting at 0. Let

(1.78) $H_N = \sum_{k \in \mathbb{N}_0} \lambda_k N^{-k} \int_0^\infty P_{2s}(0, B_k(0))\, ds,$

where $B_k(0)$ is the $k$-block in $\Omega_N$ around 0 (recall (1.22)) and $P_t(0, B_k(0)) = \sum_{\eta \in B_k(0)} P_t(0, \eta)$. We will see in Section 2.4.2 that $H_N$ in (1.78) is the expected hazard for two partition elements in the spatial $\Lambda$-coalescent with block coalescence to coalesce. In particular, the second summand in (1.77) is induced by the reshuffling in the spatial $\Lambda$-coalescent with block coalescence.

Our last set of main theorems identify the ergodic behaviour for finite $N$.



Theorem 1.13. [Dichotomy for finite N]

The following dichotomy holds for every $N \in \mathbb{N} \setminus \{1\}$:

(a) [Local coexistence] If $H_N < \infty$, then

(1.79) $\liminf_{t\to\infty} \sup_{\psi \in C_b(E, \mathbb{R})} E_{\mathcal{L}[X^{(\Omega_N)}(t)]}[\mathrm{Var}_{x_\eta}(\psi)] > 0, \quad$ for all $\eta \in \Omega_N$.

(b) [Clustering] If $H_N = \infty$, then

(1.80) $\lim_{t\to\infty} \sup_{\psi \in C_b(E, \mathbb{R})} E_{\mathcal{L}[X^{(\Omega_N)}(t)]}[\mathrm{Var}_{x_\eta}(\psi)] = 0, \quad$ for all $\eta \in \Omega_N$.

This dichotomy can be sharpened using duality theory and the complete longtime behaviour of $X^{(\Omega_N)}$ can be identified.

Theorem 1.14. [Ergodic behaviour for finite N]

The following dichotomy holds:

(a) [Local coexistence] If $H_N < \infty$, then for every $\theta \in \mathcal{P}(E)$ and every $X^{(\Omega_N)}(0)$ whose law is stationary and ergodic w.r.t. translations in $\Omega_N$ and has a single-site mean $\theta$,

(1.81) $\mathcal{L}\big[ X^{(\Omega_N)}(t) \big] \underset{t \to \infty}{\Longrightarrow} \nu_\theta^{(\Omega_N), c, \Lambda} \in \mathcal{P}\big( \mathcal{P}(E)^{\Omega_N} \big)$

for some unique law $\nu_\theta^{(\Omega_N), c, \Lambda}$ that is stationary and ergodic w.r.t. translations in $\Omega_N$ and has single-site mean $\theta$.

(b) [Clustering] If $H_N = \infty$, then, for every $\theta \in \mathcal{P}(E)$,

(1.82) $\mathcal{L}\big[ X^{(\Omega_N)}(t) \big] \underset{t \to \infty}{\Longrightarrow} \int_E \theta(du)\, \delta_{(\delta_u)^{\Omega_N}} \in \mathcal{P}\big( \mathcal{P}(E)^{\Omega_N} \big).$

Theorem 1.15. [Agreement of dichotomy for $N < \infty$ and $N = \infty$]

The dichotomies in Theorems 1.9 and 1.14 coincide, i.e., $\sum_{k \in \mathbb{N}_0} m_k = \infty$ if and only if $H_N = \infty$.

1.7 Discussion

Summary. We have constructed the $C_N^{c,\Lambda}$-process, describing hierarchically interacting Cannings processes, and have identified its space-time scaling behaviour in the hierarchical mean-field limit $N \to \infty$ (interaction chain). We have fully classified the clustering vs. local coexistence dichotomy in terms of the parameters $c, \Lambda$ of the model, and found different regimes of cluster formation. Moreover, we have verified the dichotomy also for finite $N$. Our results provide a full generalisation of what was known for hierarchically interacting diffusions, and show that Cannings resampling leads to new phenomena.



Diverging volatility of the Fleming-Viot part and local coexistence. The growth of the block resampling rates $(\mu_k)_{k \in \mathbb{N}}$ can lead to a situation where, as we pass to larger block averages, the volatility of the Fleming-Viot part of the asymptotic limit dynamics diverges, even though on the level of a single component the system exhibits local coexistence. This requires that the migration rates are (barely) transient and the block resampling rate decays very slowly. An example of such a situation is the choice $c_k = k(\log k)^3$ and $\mu_k = 1/k$, which leads to $d_k \sim \log k$ and $m_k \sim 1/k(\log k)^2$ as $k \to \infty$. Thus, the system may be in the local coexistence regime and yet have a diverging volatility on large space-time scales.
Open problems. The results of Sections 1.5 and 1.6 suggest that a dichotomy between clustering and local coexistence also holds for a suitably defined Cannings model with non-local resampling on $\mathbb{Z}^d$, $d \geq 3$. In addition, a continuum limit to the geographic space $\mathbb{R}^2$ ought to arise as well. The latter may be easier to investigate in the limit $N \to \infty$, following the approach outlined in Greven [Gre05]. Another open problem concerns the different ways in which cluster formation can occur. Here, the limit $N \to \infty$ could already give a good picture of what is to be expected for finite $N$. A further task is to investigate the genealogical structure of the model, based on the work in Greven, Klimovsky and Winter [GKWpr] for the model without multi-colony Cannings resampling (i.e., $\Lambda_k = \delta_0$ for $k \in \mathbb{N}$).



Outline. Section 2 introduces the spatial $\Lambda$-coalescent with block coalescence and derives some of its key properties. Sections 3-11 use the results in Section 2 to prove the propositions and the theorems stated in Sections 1.3-1.6. Section 3 handles all issues related to the well-posedness of martingale problems. Section 4 deals with the properties of the McKean-Vlasov process. Section 5 outlines the strategy behind the proofs of the scaling results for the hierarchical Cannings process, which are worked out in Sections 6-9. Section 10 proves the scaling results for the interaction chain. Section 11 derives the scaling results for the volatility constant. Section 12 collects the notation.



2 Spatial A-coalescent with block-coalescence 

In this section, we introduce a new class of spatial $\Lambda$-coalescent processes, namely, processes where coalescence of partition elements at distances larger than or equal to zero can occur. This is a generalisation of the spatial coalescent introduced by Limic and Sturm [LS06], which allows for the coalescence of blocks residing at the same location only. Informally, the spatial $\Lambda$-coalescent with block coalescence is the process that encodes the family structure of a sample from the currently alive population in the $C_N^{c,\Lambda}$-process, i.e., it is the process of coalescing lineages that occur when the evolution of the spatial $C_N^{c,\Lambda}$-Cannings process is traced backwards in time up to a common ancestor. In what follows, we denote this backwards-in-time process by

Two Markov processes $X$ and $Y$ with Polish state spaces $\mathcal{E}$ and $\mathcal{E}'$ are called dual w.r.t. the duality function $H \colon \mathcal{E} \times \mathcal{E}' \to \mathbb{R}$ if

(2.1) $E_{X_0}[H(X_t, Y_0)] = E_{Y_0}[H(X_0, Y_t)] \quad$ for all $(X_0, Y_0) \in \mathcal{E} \times \mathcal{E}'$,

and if the family $\{H(\cdot, Y_0) \colon Y_0 \in \mathcal{E}'\}$ uniquely determines a law on $\mathcal{E}$. Typically, the key point of a duality relation is to translate questions about a complicated process into questions about a simpler process. This translation often allows for an analysis of the long-time behaviour of the process, as well as a proof of existence and uniqueness for associated martingale problems. If $H(\cdot, \cdot) \in C_b(\mathcal{E} \times \mathcal{E}')$, and if $H(\cdot, Y_0)$ and $H(X_0, \cdot)$ are in the domain of the generator of $X$, respectively, $Y$ for all $(X_0, Y_0) \in \mathcal{E} \times \mathcal{E}'$, then it is possible to establish duality by just checking a generator relation (see Remark 2.9 below and also Liggett [L85, Section II.3]).

The analysis of the processes on their relevant time scales will lead us to study a number of auxiliary processes on geographic spaces different from $\Omega_N$. The duality will be crucial for the proof of Propositions 1.1-1.3 (martingale well-posedness) in Section 3, and also for statements about the long-time behaviour of the processes and the qualitative properties of their equilibria. In Section 2.1, we define the spatial $\Lambda$-coalescent without block coalescence. In Section 2.2, we add block coalescence. In Section 2.3 we formulate and prove the duality relation between the $C_N^{c,\Lambda}$-process and the spatial $\Lambda$-coalescent. In Section 2.4 we look at the long-time behaviour of the $\Lambda$-coalescent.

2.1 Spatial A-coalescent without block coalescence 

In this section, we briefly recall the definition of the spatial $\Lambda$-coalescent on a countable
geographic space $G$ as introduced by Limic and Sturm [LS06]. (For a general discussion of
exchangeable coalescents, see Berestycki [B09].) In Section 2.2, we will add block coalescence,
i.e., coalescence of individuals not necessarily located at the same site.

The following choices of the geographic space G will be needed later on: 



(2.2) $G_{N,K} = \{0,\ldots,N-1\}^K$, $K,N \in \mathbb{N}$, $\quad G = \Omega_N$, $N \in \mathbb{N}$, $\quad G = \{0,*\}$, $\quad G = \mathbb{N}$.



The choices in (2.2) correspond to geographic spaces that are needed, respectively, for finite
approximations of the hierarchical group, for the hierarchical group, for a single colony with
immigration-emigration, and for the McKean-Vlasov limit. We define the basic transition
mechanisms and characterise the process by a martingale problem in order to be able to verify
duality and to prove convergence properties. In Section 2.1.1, we define the state space and the
evolution rules, in Section 2.1.2 we formulate the martingale problem, while in Section 2.1.3
we introduce coalescents with immigration-emigration.

2.1.1 State space, evolution rules, graphical construction and entrance law 

State space. As with non-spatial exchangeable coalescents, it is convenient to start with
finite state spaces and subsequently extend to infinite state spaces via exchangeability. Given
$n \in \mathbb{N}$, consider the set

(2.3) $[n] = \{1,\ldots,n\}$

and the set $\Pi_n \subset 2^{2^{[n]}}$ of its partitions into partition elements called families:

(2.4) $\Pi_n$ = set of all partitions $\pi = \{\pi_i\}_{i=1}^{b}$ of $[n]$ into disjoint families $\pi_i \subset [n]$, $i \in [b]$.

Thus, for any $\pi = \{\pi_i\}_{i=1}^{b} \in \Pi_n$, we have $[n] = \bigcup_{i=1}^{b} \pi_i$, where $\pi_i \cap \pi_j = \emptyset$ for $i,j \in [b]$ with
$i \neq j$. In what follows we denote by

(2.5) $b = b(\pi) \in [n]$

the number of families in $\pi \in \Pi_n$.

Remark 2.1. By a slight abuse of notation, we can associate with $\pi \in \Pi_n$ the mapping
$\pi\colon [n] \to [b]$ defined as $\pi(i) = k$, where $k \in [b]$ is such that $i \in \pi_k$. In words, $k$ is the label of
the unique family containing $i$.

The state space of the spatial coalescent is the set of $G$-labelled partitions defined as

(2.6) $\Pi_{G,n} = \bigl\{\pi_G = \{(\pi_1,g_1),(\pi_2,g_2),\ldots,(\pi_b,g_b)\}\colon \{\pi_1,\ldots,\pi_b\} \in \Pi_n,\ g_1,\ldots,g_b \in G\bigr\}$.

For definiteness, we assume that the families of $\pi_G \in \Pi_{G,n}$ are indexed in the increasing order
of each family's smallest element, i.e., the enumeration is such that $\min \pi_i < \min \pi_j$ for all
$i,j \in [b]$ with $i < j$.
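
To make the indexing convention concrete, the following minimal Python sketch represents a partition of $[n]$ as a list of families, puts it into the canonical order (families sorted by smallest element), and returns the map $\pi\colon [n] \to [b]$ of Remark 2.1. The function names `canonical_partition` and `family_map` are hypothetical and serve for illustration only.

```python
def canonical_partition(families):
    """Sort each family and order the families by their smallest element."""
    return sorted((tuple(sorted(f)) for f in families), key=lambda f: f[0])


def family_map(families, n):
    """Return [pi(1), ..., pi(n)], where pi(i) is the 1-based index of the family containing i."""
    ordered = canonical_partition(families)
    elements = sorted(i for fam in ordered for i in fam)
    if elements != list(range(1, n + 1)):
        raise ValueError("the families do not form a partition of [n]")
    lookup = {i: k for k, fam in enumerate(ordered, start=1) for i in fam}
    return [lookup[i] for i in range(1, n + 1)]


# Example: the partition {{3,5},{1,4},{2}} of [5].
example = [{3, 5}, {1, 4}, {2}]
ordered_example = canonical_partition(example)
map_example = family_map(example, 5)
```

Here the family $\{1,4\}$ receives index 1, $\{2\}$ index 2 and $\{3,5\}$ index 3, in line with the ordering by smallest element.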
Let $S_{G,n} \subset \Pi_{G,n}$ denote the set of labelled partitions of $[n]$ into singletons, i.e.,

(2.7) $S_{G,n} = \bigl\{\{(\{1\},g_1),(\{2\},g_2),\ldots,(\{n\},g_n)\}\colon g_i \in G,\ i \in [n]\bigr\}$.

With each $\pi_G \in \Pi_{G,n}$ we can naturally associate the partition $\pi \in \Pi_n$ by removing the labels,
i.e., with

(2.8) $\pi_G = \{(\pi_1,g_1),(\pi_2,g_2),\ldots,(\pi_b,g_b)\}$

we associate $\pi = \{\pi_1,\ldots,\pi_b\} \in \Pi_n$. With each $\pi_G \in \Pi_{G,n}$ we also associate the set of its
labels

(2.9) $L(\pi_G) = \{g_1,\ldots,g_b\} \subset G$.

In addition to the finite-$n$ sets $\Pi_n$ and $\Pi_{G,n}$ considered above, consider their infinite
versions

(2.10) $\Pi = \{\text{partitions of } \mathbb{N}\}$, $\quad \Pi_G = \{G\text{-labelled partitions of } \mathbb{N}\}$,

and introduce the set of standard initial states

(2.11) $S_G = \bigl\{\{(\{i\},g_i)\}_{i\in\mathbb{N}}\colon g_i \in G,\ i \in \mathbb{N}\bigr\}$.

Equip $\Pi_G$ with the following topology. First, equip the set $\Pi_{G,n}$ with the discrete topology.
In particular, this implies that $\Pi_{G,n}$ is a Polish space. We say that the sequence of
labelled partitions $\{\pi_G^{(k)} \in \Pi_G\}_{k\in\mathbb{N}}$ converges to the labelled partition $\pi_G \in \Pi_G$ if the sequence
$\{\pi_G^{(k)}|_n \in \Pi_{G,n}\}_{k\in\mathbb{N}}$ converges to $\pi_G|_n \in \Pi_{G,n}$ for all $n \in \mathbb{N}$. This topology makes the
space $\Pi_G$ Polish too.

Evolution rules. Assume that we are given transition rates (= "migration rates") on $G$

(2.12) $a^*\colon G^2 \to \mathbb{R}$, $\quad a^*(g,f) = a(f,g)$,

where $a(\cdot,\cdot)$ is the migration kernel of the $C^\Lambda$-process with geographic space $G$. The spatial
$n$-$\Lambda$-coalescent is the continuous-time Markov process $\mathfrak{C}_n^{(G),\mathrm{loc}} = (\mathfrak{C}_n^{(G),\mathrm{loc}}(t) = \pi_G(t) \in \Pi_{G,n})_{t \ge 0}$
with the following dynamics. Given the current state $\pi_G = \mathfrak{C}_n^{(G),\mathrm{loc}}(t-) \in \Pi_{G,n}$, the process
$\mathfrak{C}_n^{(G),\mathrm{loc}}$ evolves via:

• Coalescence. Independently, at each site $g \in G$, the families of $\pi_G$ with label $g$ coalesce
according to the mechanism of the non-spatial $n$-$\Lambda$-coalescent. In other words, given
that in the current state of the spatial $\Lambda$-coalescent there are $b = b(\pi_G,g) \in [n]$ families
with label $g$, among these $i \in [2,b] \cap \mathbb{N}$ fixed families coalesce into one family with label
$g$ at rate $\lambda_{b,i}^{(\Lambda)}$, where

(2.13) $\lambda_{b,i}^{(\Lambda)} = \int_{(0,1]} \Lambda^*(dr)\, r^i (1-r)^{b-i}$, $\quad i \in [2,b] \cap \mathbb{N}$,

with $\Lambda^*$ given by (1.5).

• Migration. Families migrate independently at rate $a^*$, i.e., for any ordered pair of labels
$(g,g') \in G^2$, a family of $\pi_G$ with label $g \in G$ changes its label (= "migrates") to $g' \in G$
at rate $a^*(g,g')$.
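
As an illustration of the rates in (2.13), the following Python sketch evaluates $\lambda_{b,i}^{(\Lambda)}$ in closed form for one particular choice of $\Lambda$. Two assumptions are made here that are not part of the text above: that (1.5) reads $\Lambda^*(dr) = \Lambda(dr)/r^2$, and that $\Lambda$ is Lebesgue measure on $[0,1]$ (chosen only because the integral is then a Beta function). The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def rate_uniform(b, i):
    """lambda_{b,i} for Lambda = Lebesgue measure on [0,1], assuming Lambda*(dr) = Lambda(dr)/r^2.

    Then lambda_{b,i} = int_0^1 r^(i-2) (1-r)^(b-i) dr = (i-2)! (b-i)! / (b-1)!.
    """
    if not 2 <= i <= b:
        raise ValueError("need 2 <= i <= b")
    return Fraction(factorial(i - 2) * factorial(b - i), factorial(b - 1))


def total_coalescence_rate(b):
    """Total rate at which some coalescence event occurs among b families at one site."""
    return sum(comb(b, i) * rate_uniform(b, i) for i in range(2, b + 1))
```

Because $r^i(1-r)^{b-i} = r^i(1-r)^{b+1-i} + r^{i+1}(1-r)^{b-i}$, the rates in (2.13) satisfy $\lambda_{b,i} = \lambda_{b+1,i} + \lambda_{b+1,i+1}$ for any $\Lambda$. This is the consistency between the $n$-coalescents for different $n$, and it can be checked exactly with the functions above.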
Graphical construction. Next, we recall the explicit construction of the above described
spatial $n$-$\Lambda$-coalescent via Poisson point processes (see also Limic and Sturm [LS06]).

Consider the family $\mathfrak{P} = \{\mathfrak{P}^g\}_{g \in G}$ of i.i.d. Poisson point processes on $[0,\infty) \times [0,1] \times \{0,1\}^{\mathbb{N}}$
defined on the filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ with intensity measure

(2.14) $dt \otimes \bigl[\Lambda^*(dr)\,(r\delta_1 + (1-r)\delta_0)^{\otimes \mathbb{N}}\bigr](d\omega)$,

where $\omega = (\omega_i)_{i\in\mathbb{N}} \in \{0,1\}^{\mathbb{N}}$. Note that the second factor of the intensity measure in (2.14)
is not a product measure on $[0,1] \times \{0,1\}^{\mathbb{N}}$; in particular, it is not the same as

(2.15) $\bigl[\Lambda^*(dr)\,(r\delta_1 + (1-r)\delta_0)\bigr]^{\otimes \mathbb{N}}(d\omega)$.

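Given $J \subset [n]$ and $g \in G$, define the labelled coalescence map $\mathrm{coal}_{J,g}\colon \Pi_{G,n} \to \Pi_{G,n}$, which
coalesces the blocks with indices specified by $J$ and locates the newly formed block at $g$, as
follows:

(2.16) $\mathrm{coal}_{J,g}(\pi_{G,n}) = \Bigl\{\Bigl(\bigcup_{i \in J \cap [b(\pi)]} \pi_i,\ g\Bigr)\Bigr\} \cup \Bigl(\pi_{G,n} \setminus \bigcup_{i \in J \cap [b(\pi)]} \{(\pi_i,g_i)\}\Bigr)$, $\quad \pi_{G,n} \in \Pi_{G,n}$.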
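
The map in (2.16) can be sketched in Python as follows. A labelled partition is stored as a list of pairs (family, label) in the canonical order by smallest element. The function name `coal` and the data layout are illustrative choices, and returning the partition unchanged for an empty index set is a convention adopted here, not taken from the text.

```python
def coal(J, g, labelled):
    """Merge the families with 1-based indices in J into one family with label g.

    `labelled` is a list of (family, label) pairs, each family a sorted tuple,
    ordered by smallest element. Indices in J outside [b] are ignored.
    """
    b = len(labelled)
    J = {i for i in J if 1 <= i <= b}          # J intersected with [b]
    if not J:
        return list(labelled)                   # convention: nothing to merge
    merged = tuple(sorted(x for i in J for x in labelled[i - 1][0]))
    rest = [fl for i, fl in enumerate(labelled, start=1) if i not in J]
    # Re-establish the canonical order by smallest element.
    return sorted(rest + [(merged, g)], key=lambda fl: fl[0][0])


start = [((1,), "a"), ((2,), "a"), ((3,), "b")]
step1 = coal({1, 2}, "a", start)   # families {1} and {2} merge at site "a"
step2 = coal({1, 3}, "b", start)   # families {1} and {3} merge, new family placed at "b"
```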

Using $\mathfrak{P}$, we construct the standard spatial $n$-$\Lambda$-coalescent $\mathfrak{C}_n^{(G),\mathrm{loc}} = (\mathfrak{C}_n^{(G),\mathrm{loc}}(t))_{t \ge 0}$ as a
Markov $\Pi_{G,n}$-valued process with the following properties:

• Initial state. Assume $\mathfrak{C}_n^{(G),\mathrm{loc}}(0) \in S_{G,n}$.

• Coalescence. For each $g \in G$ and each point $(t,r,\omega)$ of the Poisson point process $\mathfrak{P}^g$
satisfying $\sum_{i\in\mathbb{N}} \omega_i \ge 2$, all families $(\pi_i(t-),g_i(t-)) \in \mathfrak{C}_n^{(G),\mathrm{loc}}(t-)$ such that $g_i(t-) = g$
and $\omega_i = 1$ coalesce into a new family labelled by $g$, i.e.,

(2.17) $\mathfrak{C}_n^{(G),\mathrm{loc}}(t) = \mathrm{coal}_{\{i \in [n]\colon \omega_i = 1,\ g_i(t-) = g\},\,g}\bigl(\mathfrak{C}_n^{(G),\mathrm{loc}}(t-)\bigr)$.

• Migration. Between the coalescence events, the labels of all partition elements of
$\mathfrak{C}_n^{(G),\mathrm{loc}}(t)$ perform independent random walks with transition rates $a^*$. [11]

In what follows, we denote by $\cdot|_n\colon \Pi_{G,m} \to \Pi_{G,n}$ (respectively, $\cdot|_n\colon \Pi_G \to \Pi_{G,n}$) the
operation of projection of all families in $[m]$ (respectively, $\mathbb{N}$) onto $[n]$.

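Entrance law. Note that, by construction, the spatial $n$-$\Lambda$-coalescent satisfies the following
consistency property:

(2.18) $\mathcal{L}\bigl[\mathfrak{C}_m^{(G),\mathrm{loc}}|_n\bigr] = \mathcal{L}\bigl[\mathfrak{C}_n^{(G),\mathrm{loc}}\bigr]$, $\quad n,m \in \mathbb{N}$, $n \le m$.

Therefore, by the Kolmogorov extension theorem, there exists a process

(2.19) $\mathfrak{C}^{(G),\mathrm{loc}} = (\mathfrak{C}^{(G),\mathrm{loc}}(t) \in \Pi_G)_{t \ge 0}$

such that $\mathfrak{C}^{(G),\mathrm{loc}}|_n = \mathfrak{C}_n^{(G),\mathrm{loc}}$.

Definition 2.2 ([LS06]). Call the process $\mathfrak{C}^{(G),\mathrm{loc}}$ the spatial $\Lambda$-coalescent corresponding to
the migration rates $a^*$ and the coalescence measure $\Lambda$.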



[11] The adjective "between" is well defined because the set of points $(t,r,\omega)$ of $\mathfrak{P}^g$ satisfying the condition
$\sum_{i\in\mathbb{N}} \omega_i \ge 2$ is topologically discrete, and hence can be ordered w.r.t. the first coordinate (= time).
2.1.2 Martingale problem

In this section, we characterise the spatial $\Lambda$-coalescent as the unique solution of the
corresponding well-posed martingale problem.

Let $\mathcal{C}_G$ be the algebra of bounded continuous functions $F\colon \Pi_G \to \mathbb{R}$ such that for all
$F \in \mathcal{C}_G$ there exists an $n \in \mathbb{N}$ and a bounded function

(2.20) $F_n\colon \Pi_{G,n} \to \mathbb{R}$

with the property that $F(\cdot) = F_n(\cdot|_n)$. In words, $F$ only depends on the family structure of
a finite number of individuals. It is easy to check that $\mathcal{C}_G$ separates points on $\Pi_G$. Given
$f,g \in G$ and $i \in [n]$, define the migration map $\mathrm{mig}_{f\to g,i}\colon \Pi_{G,n} \to \Pi_{G,n}$ as

(2.21) $\mathrm{mig}_{f\to g,i}(\pi_{G,n}) = \begin{cases} \bigl(\pi_{G,n} \setminus \{(\pi_i,f)\}\bigr) \cup \{(\pi_i,g)\}, & (\pi_i,f) \in \pi_{G,n}, \\ \pi_{G,n}, & (\pi_i,f) \notin \pi_{G,n}, \end{cases}$ $\quad \pi_{G,n} \in \Pi_{G,n}$,

describing the jump in which the family labelled $i$ migrates from colony $f$ to colony $g$.
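Consider the linear operator $L^*_G$ defined as

(2.22) $L^*_G = L^*_{\mathrm{mig},G} + L^*_{\mathrm{coal},G}$,

where the operators $L^*_{\mathrm{mig},G}, L^*_{\mathrm{coal},G}\colon \mathcal{C}_G \to \mathcal{C}_G$ are defined for $\pi_G \in \Pi_G$ and $F \in \mathcal{C}_G$ as

(2.23) $(L^*_{\mathrm{mig},G}F)(\pi_G) = \sum_{i=1}^{n} \sum_{g,f \in G} a^*(g,f)\,\bigl[F_n(\mathrm{mig}_{g\to f,i}(\pi_G|_n)) - F(\pi_G)\bigr]$,

(2.24) $(L^*_{\mathrm{coal},G}F)(\pi_G) = \sum_{g \in G} \sum_{J \subset \{i \in [n]\colon g_i = g\},\ |J| \ge 2} \lambda^{(\Lambda)}_{b(\pi_G|_n,g),|J|}\,\bigl[F_n(\mathrm{coal}_{J,g}(\pi_G|_n)) - F(\pi_G)\bigr]$.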

Proposition 2.3. [Martingale problem for the spatial coalescent without block coalescence]

The spatial $\Lambda$-coalescent defined in Section 2.1.1 solves the well-posed martingale problem for
$(L^*_G, C_b(\Pi_G), \delta_{S_G})$.

Proof. A straightforward inspection of the graphical construction yields the existence. The
uniqueness is immediate because we have a duality relation, as we will see in Section 2.3. □

Remark 2.4. Note that, instead of the singleton initial condition in Proposition 2.3 (and in
the graphical construction of Section 2.1.1), we can use any other initial condition in $\Pi_G$.



2.1.3 Mean-field and immigration-emigration $\Lambda$-coalescents

Some special spatial $\Lambda$-coalescents will be needed in the course of our analysis of the
hierarchically interacting Cannings process. We define the mean-field $\Lambda$-coalescent as the spatial
$\Lambda$-coalescent with geographic space $G = \{0,\ldots,N-1\}$ and migration kernel $a(i,j) = c/N$ for
all $i,j \in G$ with $i \neq j$. Furthermore, we define the $\Lambda$-coalescent with immigration-emigration
as the spatial $\Lambda$-coalescent with geographic space $G = \{0,*\}$ and migration kernel $a(0,*) = c$,
$a(*,0) = 0$. In other words, $*$ is a cemetery migration state.
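
The evolution rules of Section 2.1.1, specialised to the mean-field case, can be simulated with a standard Gillespie scheme. The sketch below is illustrative only and rests on several assumptions not made in the text: $\Lambda$ is Lebesgue measure on $[0,1]$ with $\Lambda^*(dr) = \Lambda(dr)/r^2$ (so that the rates in (2.13) have the closed form used in `rate_uniform`), the initial labels are drawn uniformly at random, and all function names are hypothetical.

```python
import random
from math import comb, factorial


def rate_uniform(b, i):
    """lambda_{b,i} of (2.13) for Lambda = Lebesgue on [0,1], assuming Lambda*(dr) = Lambda(dr)/r^2."""
    return factorial(i - 2) * factorial(b - i) / factorial(b - 1)


def simulate_mean_field(n, N, c, t_max, seed=0):
    """Simulate the mean-field n-Lambda-coalescent on G = {0,...,N-1} up to time t_max."""
    rng = random.Random(seed)
    # State: list of (family, site); start from singletons with random labels.
    state = [(frozenset([i]), rng.randrange(N)) for i in range(1, n + 1)]
    t = 0.0
    history = [(t, len(state))]
    while True:
        by_site = {}
        for idx, (_, site) in enumerate(state):
            by_site.setdefault(site, []).append(idx)
        events = []  # entries: (rate, kind, site, number of families merging)
        mig_rate = c * (N - 1) / N * len(state)  # kernel a(i,j) = c/N for i != j
        if mig_rate > 0:
            events.append((mig_rate, "mig", None, None))
        for site, idxs in by_site.items():
            b = len(idxs)
            for i in range(2, b + 1):
                events.append((comb(b, i) * rate_uniform(b, i), "coal", site, i))
        total = sum(e[0] for e in events)
        if total == 0:
            break
        t += rng.expovariate(total)
        if t > t_max:
            break
        u, acc = rng.random() * total, 0.0
        for rate, kind, site, size in events:
            acc += rate
            if u < acc:
                break
        if kind == "mig":
            idx = rng.randrange(len(state))
            fam, old = state[idx]
            state[idx] = (fam, rng.choice([s for s in range(N) if s != old]))
        else:
            chosen = set(rng.sample(by_site[site], size))
            merged = frozenset().union(*(state[j][0] for j in chosen))
            state = [fl for j, fl in enumerate(state) if j not in chosen] + [(merged, site)]
        history.append((t, len(state)))
    return state, history
```

A call such as `simulate_mean_field(10, 3, 1.0, 50.0, seed=1)` returns the final labelled partition together with the recorded number of families over time. The number of families can only decrease, since migration changes labels but not the family structure.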
2.2 Spatial $\Lambda$-coalescent with block coalescence

In this section, we construct a new type of spatial coalescent process based on a sequence
$(\Lambda_k)_{k\in\mathbb{N}_0}$ of finite measures on $[0,1]$, namely, the spatial $\Lambda$-coalescent on $G = \Omega_N$ with block
coalescence. For each $k \in \mathbb{N}_0$, we introduce two additional transition mechanisms: (1) a
block reshuffling of all partition elements in a ball of radius $k$; (2) a $\Lambda$-block coalescence with
resampling measure $\Lambda_k$ of all partition elements in a ball of radius $k$. In Section 2.2.1 we give
definitions, in Section 2.2.2 we formulate the martingale problem.



2.2.1 The evolution rules and the Poissonian construction 



We start by extending the Poissonian construction from Section 2.1.1 to incorporate the
additional transition mechanisms of block reshuffling and block coalescence.
Consider the Poisson point process $\mathfrak{P}^{(\Omega_N)}$ on

(2.25) $[0,\infty) \times \Omega_N \times \mathbb{N}_0 \times [0,1] \times \{0,1\}^{\mathbb{N}}$

defined on the filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ with intensity measure

(2.26) $dt \otimes d\eta \otimes \bigl(N^{-2k}\,dk\,\Lambda_k^*(dr)\,(r\delta_1 + (1-r)\delta_0)^{\otimes \mathbb{N}}\bigr)(d\omega)$,

where $\omega = (\omega_i)_{i\in\mathbb{N}} \in \{0,1\}^{\mathbb{N}}$, $(t,\eta,k,r,\omega) \in [0,\infty) \times \Omega_N \times \mathbb{N}_0 \times [0,1] \times \{0,1\}^{\mathbb{N}}$, $dk$ is counting
measure on $\mathbb{N}_0$ and $d\eta$ is counting measure on $\Omega_N$. Again, note that the third factor in (2.26)
is not a product measure (compare (2.15)).



Given S <s Qn (he., £ is a finite subset of J7tv) and £ = {£j}^ =1 C S, let reshs^: LT^ 
11(1 be the reshuffling map that for all z moves families from rji E S to £j E S: 



(2.27) resh Ei€ (7rn 



TTfijv G n niv , i E [&(tt[^ 



Let 

(2.28) C/ E = {C/s(0h6S 

be a collection of independent random variables uniformly distributed on $S$. Using $\mathfrak{P}^{(\Omega_N)}$,
we construct the standard spatial $n$-$\Lambda$-coalescent with block coalescence $\mathfrak{C}_n^{(\Omega_N)} = (\mathfrak{C}_n^{(\Omega_N)}(t) \in \Pi_{\Omega_N,n})_{t \ge 0}$
as the $\Pi_{\Omega_N,n}$-valued Markov process with the following properties:

• Initial state. Assume $\mathfrak{C}_n^{(\Omega_N)}(0) \in S_{\Omega_N,n}$.

• Coalescence with reshuffling. For each point $(t,\eta,k,r,\omega)$ of the Poisson point process
$\mathfrak{P}^{(\Omega_N)}$, all families $(\pi_i,\eta_i) \in \mathfrak{C}_n^{(\Omega_N)}(t-)$ such that $\omega_i = 1$ and $\eta_i \in B_k(\eta)$ coalesce into a
new family with label $\eta$. Subsequently, all families with labels $\xi \in B_k(\eta)$ obtain a new
label that is drawn independently and uniformly from $B_k(\eta)$. In a formula:

(2.29) $\mathfrak{C}_n^{(\Omega_N)}(t) = \mathrm{resh}_{B_k(\eta),\,U_{B_k(\eta)}} \circ \mathrm{coal}_{\{i \in [n]\colon \omega_i = 1,\ \eta_i(t-) \in B_k(\eta)\},\,\eta}\bigl(\mathfrak{C}_n^{(\Omega_N)}(t-)\bigr)$.

Note that, in contrast with the spatial coalescent from Section 2.1, the coalescence
mechanism in (2.29) is no longer local: all families whose labels are in $B_k(\eta)$, $k \in \mathbb{N}_0$, are
involved in the coalescence event at site $\eta \in \Omega_N$.
• Migration. Independently of the coalescence events, the labels of all partition elements
of $\mathfrak{C}_n^{(\Omega_N)}(t)$ perform independent random walks with transition rates $a^{(N)}(\cdot,\cdot)$ (recall
(1.25)).

As in Section 2.1, the consistency-between-restrictions property allows us to apply the
Kolmogorov extension theorem to the family $\{\mathfrak{C}_n^{(\Omega_N)}\}_{n\in\mathbb{N}}$ to construct the Markov process

(2.30) $\mathfrak{C}^{(\Omega_N)}$

taking values in $\Pi_{\Omega_N}$.

Definition 2.5. The process $\mathfrak{C}^{(\Omega_N)}$ is called the spatial $\Lambda$-coalescent with block coalescence
corresponding to the resampling measures $(\Lambda_k)_{k\in\mathbb{N}_0}$ (recall (1.27)) and the migration coefficients
$(c_k)_{k\in\mathbb{N}_0}$ (recall (1.24)).



Proposition 2.6. [Feller property]

The process $\mathfrak{C}^{(\Omega_N)}$ is a càdlàg strong Markov process with the Feller property.

Proof. This is an immediate consequence of the Poissonian construction. □



2.2.2 Martingale problem 

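In this section, we characterise the spatial $\Lambda$-coalescent with block coalescence as the solution
of the corresponding martingale problem.

Given $\pi_{\Omega_N,n} \in \Pi_{\Omega_N,n}$ and $B_k(\eta) \subset \Omega_N$, denote the number of families of $\pi_{\Omega_N,n}$ with labels
in $B_k(\eta)$ by

(2.31) $b(\pi_{\Omega_N,n}, B_k(\eta)) = \bigl|\{(\pi_i,\eta_i) \in \pi_{\Omega_N,n}\colon \eta_i \in B_k(\eta)\}\bigr| \in \mathbb{N}$.

Recall the definition of the algebra of test functions $\mathcal{C}_G$ from Section 2.1.2. Let $\pi_{\Omega_N} =
\{(\pi_i,\eta_i)\}_{i=1}^{b} \in \Pi_{\Omega_N}$, $F \in \mathcal{C}_{\Omega_N}$ and $F(\cdot) = F_n(\cdot|_n)$. Consider the linear operator $L^{(\Omega_N)*}$
defined as

(2.32) $L^{(\Omega_N)*} = L^{(\Omega_N)*}_{\mathrm{mig}} + L^{(\Omega_N)*}_{\mathrm{coal}}$,

where the linear operators $L^{(\Omega_N)*}_{\mathrm{mig}}$ and $L^{(\Omega_N)*}_{\mathrm{coal}}$ are defined as follows (recall (2.20)). The
migration operator is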



(2.33) [L^*F) = £ £ C) [i ? »(inig l ^(*nj„)) - F(tt^)] 

and the block-coalescence-reshuffling operator is
(2.34) 



A 



JC[b(ir njVn ,B fc (T,))], 
|Jj>2 



(A*) 

MTG,n.Bfc(»?)),|J| 



[F„(resh Bfc(??)iC ocoal {ie j : ^eB^)}^ 7 ^!™)) ~ F {^n N )) ■ 
Proposition 2.7. [Martingale problem: Spatial $\Lambda$-coalescent with block coalescence]

The spatial $\Lambda$-coalescent with block coalescence $\mathfrak{C}^{(\Omega_N)}$ defined in Section 2.2 solves the
well-posed martingale problem $(L^{(\Omega_N)*}, \mathcal{C}_{\Omega_N}, \delta_{S_{\Omega_N}})$.

Proof. A straightforward inspection of the graphical construction in Section 2.2 yields the
existence of a solution. Uniqueness on finite geographic spaces is clear: this follows in the same
way as for the single-site case. Once we have well-posedness for finite geographic spaces, we
can show uniqueness for $G = \Omega_N$ via approximation. The approximation via finite geographic
spaces follows from the fact that the occupation numbers of the sites are stochastically smaller
than in the case of pure random walks (see Liggett and Spitzer [LS81]). □

Remark 2.8. Note that, instead of the singleton initial condition in Proposition 2.7 (and in
the graphical construction of Section 2.2), we can use any other initial condition in $\Pi_{\Omega_N}$.



2.3 Duality relations 



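We next formulate and prove the duality relation between the $C_N^{\underline{c},\underline{\Lambda}}$-process and the
$\Lambda$-coalescent described so far. This follows a general pattern for all choices of the geographic
space $G$ in (2.2). We only give the proof for the case $G = \Omega_N$.

Recall (2.1). The construction of the duality function $H(\cdot,\cdot)$ requires some new ingredients.
For $n \in \mathbb{N}$ and $\varphi \in C_b(E^n,\mathbb{R})$, consider the bivariate function $H^{(n)}_\varphi\colon \mathcal{P}(E)^G \times \Pi_{G,n} \to \mathbb{R}$ of
the form

(2.35) $H^{(n)}_\varphi(x, \pi_{G,n}) = \int_{E^b} \Bigl(\bigotimes_{i=1}^{b} x_{\eta_{\pi^{-1}(i)}}(du_i)\Bigr)\, \varphi\bigl(u_{\pi(1)}, \ldots, u_{\pi(n)}\bigr)$,

where $x = (x_\eta)_{\eta \in G} \in \mathcal{P}(E)^G$, $\pi_{G,n} \in \Pi_{G,n}$, $b = b(\pi_{G,n}) = |\pi_{G,n}|$, $(\eta_i)_{i \in [b]} = L(\pi_{G,n})$ are the
labels of the partition $\pi_{G,n}$, and (with a slight abuse of notation) $\pi\colon [n] \to [b]$ is the map from
Remark 2.1. In words, the functions in (2.35) assign the same type to individuals that belong
to the same family. Note that these functions form a family of functions on $\mathcal{P}(E)^G$,

(2.36) $\bigl\{H^{(n)}_\varphi(\cdot, \pi_{G,n})\colon \mathcal{P}(E)^G \to \mathbb{R} \ \big|\ \pi_{G,n} \in \Pi_{G,n},\ n \in \mathbb{N},\ \varphi \in C_b(E^n,\mathbb{R})\bigr\}$,

that separates points. The $C_N^{\underline{c},\underline{\Lambda}}$-process with block resampling and the spatial $\Lambda$-coalescent
with block coalescence are mutually dual w.r.t. the duality function $H(\cdot,\cdot)$ given by

(2.37) $H(x, (\varphi, \pi_{G,n})) = H^{(n)}_\varphi(x, \pi_{G,n})$, $\quad x \in \mathcal{E} = \mathcal{P}(E)^G$, $(\varphi, \pi_{G,n}) \in \mathcal{E}'$,

with $\mathcal{E}' = \bigcup_{n\in\mathbb{N}} \bigl(C_b(E^n,\mathbb{R}) \times \Pi_{G,n}\bigr)$.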

We proceed with the following observation. 

Remark 2.9. Let $X$ and $Y$ be two processes that are dual w.r.t. a continuous and bounded
duality function $H(\cdot,\cdot)$. Assume that $X$ and $Y$ are solutions to martingale problems
corresponding to operators $L_X$, respectively, $L_Y$. Then the generator relation

(2.38) $[L_X(H(\cdot, Y_0))](X_0) = [L_Y(H(X_0, \cdot))](Y_0)$, $\quad (X_0, Y_0) \in \mathcal{E} \times \mathcal{E}'$,

is equivalent to the duality relation (2.1) (see e.g. Ethier and Kurtz [EK86, Section 4.4]).
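
A generator relation of the type (2.38) can be checked mechanically in a simple finite-state example. The sketch below does not treat the processes of this paper. It uses the classical duality between the Moran model with $N$ individuals (number $k$ of type-A individuals, each unordered pair resampling at rate 1) and the block-counting process of the Kingman coalescent (each pair of lineages merging at rate 1), with the hypergeometric duality function $H(k,n) = \binom{k}{n}/\binom{N}{n}$. The rate normalisation and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb


def H(k, n, N):
    """Probability that n individuals sampled without replacement are all of type A."""
    return Fraction(comb(k, n), comb(N, n))


def LX_H(k, n, N):
    """Moran generator applied to k -> H(k, n): jumps k -> k +/- 1 at rate k(N-k)/2 each."""
    rate = Fraction(k * (N - k), 2)
    if rate == 0:
        return Fraction(0)
    return rate * (H(k + 1, n, N) + H(k - 1, n, N) - 2 * H(k, n, N))


def LY_H(k, n, N):
    """Kingman block-counting generator applied to n -> H(k, n): jump n -> n-1 at rate C(n,2)."""
    if n < 2:
        return Fraction(0)
    return comb(n, 2) * (H(k, n - 1, N) - H(k, n, N))


def generator_relation_holds(N):
    """Check the analogue of (2.38) for all states k in {0,...,N} and n in {1,...,N}."""
    return all(LX_H(k, n, N) == LY_H(k, n, N)
               for k in range(N + 1) for n in range(1, N + 1))
```

Exact rational arithmetic is used so that the two sides can be compared with equality rather than within a tolerance.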
Remark 2.10. Remark 2.9 gives, for the duality function $H(\cdot,\cdot)$, for all $t \ge 0$ and $n \in \mathbb{N}$,

(2.39) $\mathbb{E}\bigl[H^{(n)}_\varphi\bigl(X^{(G)}(t), \mathfrak{C}^{(G)}(0)|_n\bigr)\bigr] = \mathbb{E}\bigl[H^{(n)}_\varphi\bigl(X^{(G)}(0), \mathfrak{C}^{(G)}(t)|_n\bigr)\bigr]$,

as is proved in Proposition 2.11 below.



In our context, we have to verify the following relation for the linear operators in the
martingale problem.

Proposition 2.11. [Operator level duality]

For any of the geographic spaces $G = \Omega_N$, $G = \{0,\ldots,N-1\}^K$, $K \in \mathbb{N}$, and $G = \{0,*\}$ the
following holds. For all $H^{(n)}_\varphi$ as in (2.35), all $x \in \mathcal{P}(E)^G$, all $n \in \mathbb{N}$, and all $\pi_G \in \Pi_G$,

(2.40) $\bigl(L^{(G)} H^{(n)}_\varphi(\cdot, \pi_G|_n)\bigr)(x) = \bigl(L^{(G)*} H^{(n)}_\varphi(x, \cdot|_n)\bigr)(\pi_G)$.

Proof. We check the statement for $G = \Omega_N$. The proof for the other choices of $G$ is left to
the reader.

The claim follows from a straightforward inspection of (1.36)-(1.37) and (2.33)-(2.34),
respectively. Indeed, duality of the migration operators in (1.36) and (2.33) is evident:

(2.41) $\bigl(L^{(\Omega_N)}_{\mathrm{mig}} H^{(n)}_\varphi(\cdot, \pi_G|_n)\bigr)(x) = \bigl(L^{(\Omega_N)*}_{\mathrm{mig}} H^{(n)}_\varphi(x, \cdot|_n)\bigr)(\pi_G)$.

Let us check the duality of the resampling and coalescence operators in (1.37) and (2.34).
By a standard approximation argument, it is enough to consider the duality test functions
in (2.35) of the product form, i.e., with $\varphi(u) = \prod_{i=1}^{n} \varphi_i(u_i)$, where $u = (u_i)_{i=1}^{n} \in E^n$ and
$\varphi_i \in C_b(E)$. Using (1.37), (2.13) and simple algebra, for $x \in \mathcal{P}(E)^G$ and $\pi_G \in \Pi_G$ we can
rewrite the action of the resampling operator on the duality test function as follows (where
for ease of notation we assume that $\pi_G \in S_G$, i.e., $\pi_G$ has the singleton family structure)



t£»>4">(,TC|„)) (!) 



EE" 

7/eGfceNo 

f K^G,n,B k (ri)) 

n 



-2k 



[0,1] 



Al(dr)N 



—k 



i=l 



„ ' n ^ 



6(7TG,n:-Bfc('?)) 

n < 



i=l 



j- T(i)=i 



£ £ iV- 2fc / A* fc (dr)iV- fe £ f x p (d 



PGB fc (r;)' 



E 



n 



JC[6(7r G , n ,B fc (»?))] je[b( 7 r G , n ,B fc ( }7 ))]\J 
|J|>0 



'! - r )^-i (i) ,*» n ^)n( r<5a ' n ^ 

j: ?r(i)=i / i£J \ j: 7r(j)=« / 



- II { 



i=l 



n ^ 
E E N ~ 2k E 



7?eGfceN Jc[b(7T G , n ,B k ( v ))], 

\J\>2 



at- z n 



^ fe e n ^)n(^ n ^ 



P&B k {r))i&[b{ir G , n ,B k {ri))}\J \ ^eB k {r)) j:n(j)=i I i£J \ j : n(j)=i 



(2.42) 



K"KG,n,Bk(rj)) 



n 

i=l 



.r 



9vr-l(i)' n ^ 

j: jr(i)=* / 



On the other hand, according to (2.34), we have

[Lttm^\n))i- G )= e e^ ; 



V- x (Ak) 

A b(7T G:n ,B k ( v )),\J\ 
V en N k&N JC[b(n G ,n,B k ( V ))], 
\J\>2 



( 



E N ~ k E ^ 

\£ieSfe(»7) &eB k {r,) 



' ' n n ^) ( x € mi n {;: , 6J >> n ^ 

V»e[fe(7r Gi „, J B fe (r;))]\J \ j: 7r(j)=i / \ j: 7r(j)eJ / 



(2.43) 



b(TTG,n,Bk{v)} 



n 

i=i 



3", 



j: 7r(j)=i / 



Comparing (2.43) with (2.42), we get the claim. 



□ 



2.4 The long-time behaviour of the spatial $\Lambda$-coalescent with block coalescence

We next investigate the long-time behaviour of the $\Lambda$-coalescent. Subsequently, the duality
relation allows us to translate results on the long-time behaviour of the $\Lambda$-coalescent into
results on the long-time behaviour of the $C_N^{\underline{c},\underline{\Lambda}}$-process.

2.4.1 The behaviour as $t \to \infty$

In this section, we prove the existence and uniqueness of a limiting state for the $\Lambda$-coalescent
as $t \to \infty$.

Proposition 2.12. [Limiting state]

Start the $\mathfrak{C}^{(\Omega_N)}$-process in a labelled partition $\{(\pi_i,\eta_i)\}_{i=1}^{n}$, where $\{\pi_i\}_{i=1}^{n}$ form a partition of $\mathbb{N}$
and $\{\eta_i\}_{i=1}^{n}$ are the corresponding labels. If $x$ is a translation-invariant shift-ergodic random
state with mean $\theta \in \mathcal{P}(E)$, then

(2.44) $\mathcal{L}\bigl[H^{(n)}_\varphi\bigl(x, \mathfrak{C}_n^{(\Omega_N)}(t)\bigr)\bigr] \underset{t \to \infty}{\Longrightarrow} \mathcal{L}\bigl[H^{(n)}_\varphi\bigl(\theta, \mathfrak{C}_n^{(\Omega_N)}(\infty)\bigr)\bigr]$ $\quad \forall\, n \in \mathbb{N}$.
Proof. We first observe that $|\mathfrak{C}_n^{(\Omega_N)}(t)|$ is monotone non-increasing, so that there exists a limit
for the number of partition elements. This implies that the partition structure converges a.s.
to a limit partition, which we call $\mathfrak{C}_n^{(\Omega_N)}(\infty) \in \Pi_{\Omega_N,n}$. We must prove that the locations
result in an effective averaging of the configuration $x$, so that we can replace the
locations by any tuple for the (constant) configuration $\theta$. This is a standard argument (see
e.g. the proof of the ergodic theorem for the voter model in Liggett [L85]). □

Corollary 2.13. [Limiting state of McKean-Vlasov] The convergence in (2.44) holds for
$Z_\theta^{c,d,\Lambda}$.
2.4.2 The dichotomy: single ancestor versus multiple ancestors 

The key question is whether the $\mathfrak{C}^{(\Omega_N)}$-process converges to a single labelled partition element
as $t \to \infty$ with probability one or more than one partition element occurs with positive
probability. For that purpose, we have to investigate whether two tagged partition elements
coalesce with probability one or not. Recall that, by the projective property of the coalescent,
we may focus on the subsystem of just two dual individuals, because this translates into
the same dichotomy for any $n$-$\Lambda$-coalescent and hence for the entrance law starting from
countably many individuals. However, there is additional reshuffling at all higher levels, which
is triggered by a corresponding block-coalescence event. Therefore, we consider two coalescing
random walks $(Z^1_t, Z^2_t)_{t \ge 0}$ on $\Omega_N$ with migration coefficients $(c_k + \lambda_{k+1} N^{-(k+1)})_{k\in\mathbb{N}_0}$, and
coalescence at rates $(\lambda_k)_{k\in\mathbb{N}_0}$. Consider the time-$t$ accumulated hazard function for coalescence
of this pair:

(2.45) $H_N(t) = \sum_{k\in\mathbb{N}_0} \lambda_k N^{-k} \int_0^t 1\{d(Z^1_s, Z^2_s) \le k\}\,ds$.

Here, the rate $N^{-2k}$ to choose a $k$-block is multiplied by $N^{k}$ because all partition elements
in that block can trigger a coalescence event. This yields the factor $N^{-k}$ in (2.45). We
have coalescence of the random walks (= single common ancestor) with probability one if
$\lim_{t\to\infty} H_N(t) = H_N(\infty) = \infty$ a.s., but separation of the random walks (= multiple ancestors)
with positive probability if $H_N(\infty) < \infty$ a.s.

Lemma 2.14. [Zero-one law] $H_N(\infty) = \infty$ a.s. if and only if $\bar H_N = \mathbb{E}[H_N(\infty)] = \infty$.

Proof. Write $H_N(\infty) = \sum_{k\in\mathbb{N}_0} w_k L(k)$ with $w_k = \sum_{l \ge k} \lambda_l N^{-l}$ and the local times $L(k) =
\int_0^\infty 1\{d(Z^1_s, Z^2_s) = k\}\,ds$. Note that $w_k < \infty$ because of condition (1.31). By the isotropy
of the transition kernel of the hierarchical random walk (1.25), we have that $L(k)$, $k \in \mathbb{N}_0$, are
independent random variables. Hence, the claim follows from the Borel-Cantelli lemma. □

Let $P_t(\cdot,\cdot)$ denote the time-$t$ transition kernel of the hierarchical random walk on $\Omega_N$ with
migration coefficients $(\bar c_k)_{k\in\mathbb{N}_0}$, where $\bar c_k = c_k + \lambda_{k+1} N^{-(k+1)}$. Then

(2.46) $\bar H_N = \sum_{k\in\mathbb{N}_0} \lambda_k N^{-k} \int_0^\infty P_{2s}(0, B_k(0))\,ds = \frac{1}{2} \sum_{k\in\mathbb{N}_0} \lambda_k N^{-k} \int_0^\infty P_t(0, B_k(0))\,dt$.
feeN fceN 



From the formulae in Dawson, Gorostiza and Wakolbinger [DGW05, Section 3.1] we obtain, 
after a little computation, 



1 

P t (0,n)dt= — 
(2.47) Jo D N 



O 



N 



+ 1 



7 l>k 



N 



N 



Vr? G Oat with d(0,r/) = fceN, 



where 



(2.48) D„ = £ a^H^-A^)^ J2l= E ^ 



keN 



l>k 



met 



N' 



1 



_/ym+l 



is the jump rate of our random walk (recall (1.25)). The factor $1/D_N$ in (2.47) is needed
because the random walk in [DGW05] has jump rate 1. Note that the sums in (2.47)-(2.48) are
finite because of condition (1.26). (The formula for $\int_0^\infty P_t(0,0)\,dt$ can be easily deduced from
(2.47), but this will not be needed.)



It follows from (2.47)-(2.48) that

(2.49) $\lim_{N\to\infty} D_N = c_0$, $\quad \lim_{N\to\infty} |B_k(0) \setminus B_{k-1}(0)|^{-1} \int_0^\infty P_t\bigl(0, B_k(0) \setminus B_{k-1}(0)\bigr)\,dt = \sum_{l \ge k} \frac{1}{c_l}$, $\quad k \in \mathbb{N}$.

Hence, $\lim_{N\to\infty} \bar H_N = \infty$ if and only if $\sum_{k\in\mathbb{N}_0} \lambda_k \sum_{l \ge k} (1/c_l) = \infty$, which is the same as
$\sum_{k\in\mathbb{N}_0} (1/c_k) \sum_{l=0}^{k} \lambda_l = \infty$, the condition in Theorem 1.5(c). Thus, we see that, in the limit
as $N \to \infty$, the condition for clustering in the hierarchical Cannings process is the same as
the condition for coalescence in the hierarchical Cannings-coalescent, in line with the duality.
In fact, the above formulae tell us more: the same dichotomy holds for finite $N$. This is
because, uniformly in $N \in \mathbb{N} \setminus \{1\}$, the quantities in (2.49) are bounded from above and below
by a constant times their limit as $N \to \infty$.


3 Well-posedness of martingale problems 



Our task in this section is to prove Propositions 1.1-1.3, i.e., we have to show that the
martingale problems for the single-colony process, the McKean-Vlasov process, the multi-colony
process and the hierarchically interacting Cannings process are all well-posed (= have
a unique solution). The line of argument is the same for all. In Section 3.1, we make some
preparatory observations. In Section 3.2, we give the proofs.



3.1 Preparation 

We first show that the duality relation and the characterisation of the dual process via a
martingale problem allow us to prove the existence of a solution to the martingale problem
that is strong Markov and has càdlàg paths. To this end, observe that via the dual process
we can specify a distribution for every time $t$ and every initial state, since the dual is a unique
solution of its martingale problem (being a projective limit of a Markov jump process defined
for all times $t \ge 0$). Since the family $\{H(\cdot, Y_0)\colon Y_0 \in \mathcal{E}'\}$ separates points, this uniquely
defines a family of transition kernels $(P_{t,s})_{t \ge s \ge 0}$ satisfying the Kolmogorov equations, and
hence defines uniquely a Markov process. By construction, this Markov process solves the
martingale problem, provided we can verify the necessary path regularity.

We need to have càdlàg paths to obtain an admissible solution to the martingale problem.
For finite geographic space this follows from the theory of Feller semigroups (see Ethier and
Kurtz [EK86, Chapter 4]). For $\Omega_N$, we consider the exhausting sequence $(B_j(0))_{j\in\mathbb{N}}$ and use
the standard tightness criteria for jump processes to obtain a weak limit point solving the
martingale problem. The essential step is to control the effect on a single component of the
flow of individuals in and out of $B_j(0)$ in finite time as $j \to \infty$.

It is standard to get uniqueness of the solution from the existence of the dual process (see
e.g. [E00, Section 1.6] or [EK86, Proposition 4.4.7 and Theorem 4.4.11]). Again, this works
for all the choices of $G$ in (2.2), with a little extra effort when $G = \Omega_N$.



3.2 Proofs of well-posedness 



In this section, we prove Propositions 1.1-1.3. We follow the line of argument of Evans [Ev97,
Theorem 4.1] and derive existence and uniqueness of the spatial Cannings process from the
existence of the corresponding spatial Cannings-coalescent established in Section 2. The main
tool is duality (cf. Proposition 2.11). The proofs of Propositions 1.1-1.3 follow the same
pattern for $G = \{0,\ldots,N-1\}$, $G = \{0,*\}$ and $G = \Omega_N$.

Proof of Propositions 1.1-1.3.

• Well-posedness. First we show that there exists a Markov transition kernel $Q_t$ on $\mathcal{P}(E)^G$
such that, for all $\varphi \in C_b(E^n,\mathbb{R})$, $\pi \in \Pi_{G,n}$, $X \in \mathcal{P}(E)^G$ and $t \ge 0$,

(3.1) $\int Q_t(X, dX')\, H^{(n)}_\varphi(X', \pi) = \mathbb{E}\bigl[H^{(n)}_\varphi\bigl(X, \mathfrak{C}^{(G)}(t)\bigr) \,\big|\, \mathfrak{C}^{(G)}(0) = \pi\bigr]$.

Once (3.1) is established, the general theory of Markov processes implies the existence of a
Hunt process with the transition kernel $Q_t$ (see e.g. Blumenthal and Getoor [BG68, Theorem I.9.4]).
This càdlàg process is unique and coincides with the process $X^{(G)}$, since (3.1)
implies (2.39). There can be at most one process satisfying (2.39), since the family of duality
functions $H^{(n)}_\varphi(\cdot, \pi)$ separates points on $\mathcal{P}(E)^G$.

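• Feller property. To show that $X^{(G)}$ is a Feller process we use duality. It is enough to show
that, for any $F \in \mathcal{F}$ and any $t \ge 0$, the map

(3.2) $\mathcal{P}(E)^G \ni x \mapsto \mathbb{E}\bigl[F(X^{(G)}(t)) \,\big|\, X^{(G)}(0) = x\bigr] \in \mathbb{R}$

is continuous. In (3.2), instead of the test functions $F(\cdot) \in \mathcal{F}$, it is enough to take the duality
test functions $H^{(n)}_\varphi(\cdot, \pi_{G,n})$ from (2.35). The duality in (2.39) implies that

(3.3) $\mathbb{E}\bigl[H^{(n)}_\varphi\bigl(X^{(G)}(t), \mathfrak{C}^{(G)}(0)|_n\bigr) \,\big|\, X^{(G)}(0) = x\bigr] = \mathbb{E}\bigl[H^{(n)}_\varphi\bigl(x, \mathfrak{C}^{(G)}(t)|_n\bigr)\bigr]$, $\quad t \ge 0$.

Definition (2.35) readily implies that the right-hand side of (3.3) is Lipschitz in $x$. □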



Renormalisation of hierarchically interacting Cannings processes 






4 Properties of the McKean-Vlasov process with immigration-emigration



The purpose of this section is to show that the $Z_\theta^{c,d,\Lambda}$-process with immigration-emigration is
ergodic (Section 4.1), to identify its equilibrium distribution in terms of the dual (Section 4.3)
and to calculate its first and second moment measure (Section 4.4). The characterisation via
the dual will allow us to also show that the equilibrium depends continuously on the migration
parameter $\theta$ (Section 4.2), a key property that will be needed later on and for which we need
that the $\Lambda$-coalescent is dust-free (recall (1.3)).



4.1 Equilibrium and ergodic theorem 



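The equilibrium $\nu_\theta^{c,d,\Lambda} \in \mathcal{P}(\mathcal{P}(E))$ is the solution of the equation

(4.1) $\bigl\langle \nu_\theta^{c,d,\Lambda}, L_\theta^{c,d,\Lambda} F \bigr\rangle = 0$, $\quad \varphi \in C_b(E^n)$, $n \in \mathbb{N}$,

where we recall (1.15)-(1.17) for the form of $F$.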



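Proposition 4.1. [Ergodicity]

For every initial state $Z_\theta^{c,d,\Lambda}(0) \in \mathcal{P}(E)$,

(4.2) $\mathcal{L}\bigl[Z_\theta^{c,d,\Lambda}(t)\bigr] \underset{t \to \infty}{\Longrightarrow} \nu_\theta^{c,d,\Lambda}$

and the right-hand side is the unique equilibrium of the process. The convergence holds
uniformly in the initial state.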

Proof. We use the dual process to show that the expectation in the right-hand side of the
duality relation (2.44) converges. Indeed, we showed in (2.44) in Proposition 2.12 and its
Corollary 2.13 that the state of the duality function $H(X_0, \cdot)$ applied to the dual process
converges in law to a limiting random variable as $t \to \infty$. The duality function viewed as
a function of the first argument generates a law-determining family $\{H(\cdot, \mathfrak{C}_0)\colon \mathfrak{C}_0 \in \Pi\}$ and
hence (2.44) proves convergence.

It remains to show that the limit is independent of the initial state. Indeed, this is implied
by the fact that if we start with finitely many partition elements, then all partition elements
eventually jump to the cemetery location $\{*\}$ where all transition rates are zero and the state
is $\theta$. The latter implies that the limit is unique. Since $\mathcal{P}(E)$ is compact and the process is
Feller, there must exist an equilibrium, and this equilibrium must be equal to the $t \to \infty$
limit. □



4.2 Continuity in the centre of the drift 

We want to prove that (in the weak topology on the respective spaces)

(4.3) $\theta \mapsto \nu_\theta^{c,d,\Lambda}$

is uniformly continuous. To this end, we need to show that, for every $f \in C_b(\mathcal{P}(E),\mathbb{R})$, $\theta \mapsto
\langle \nu_\theta^{c,d,\Lambda}, f \rangle$ is uniformly continuous. Since the family $\{H(\cdot, \mathfrak{C}_0)\colon \mathfrak{C}_0 \in \Pi\}$ is dense in $C_b(\mathcal{P}(E),\mathbb{R})$,
we can approximate $f$ by duality functions in the sup-norm. It is therefore enough to show
uniform continuity for the duality function (uniformly on the family). For this purpose, we
analyse the limiting random variable for the dual as a function of $\theta$ in the limit as $t \to \infty$.
The necessary uniformity follows once we establish the existence of an entrance law starting
from $(\{1\},\{2\},\ldots)$.

Assume first that $\Lambda$ is such that the single-site $\Lambda$-coalescent comes down from $\infty$. Namely,
if we denote by $(\mathfrak{C}^{c,\Lambda}_t)_{t \ge 0}$ the entrance law at time 0 from the state $\{1\},\{2\},\ldots$ of the
$\Lambda$-coalescent with jumps to the cemetery $\{*\}$ at rate $c$, then $H(\theta, \mathfrak{C}^{c,\Lambda}_\infty)$ determines the
McKean-Vlasov limiting law as $t \to \infty$ uniquely. Recall that we associate the value $\theta$ with the cemetery
state. The fact that $\mathfrak{C}^{c,\Lambda}$ is an entrance law follows from the projective property and the fact
that

(4.4) the $\Lambda$-coalescent comes down from $\infty$ at the site (not at $\{*\}$!).

Namely, subject to (4.4), on the states where rates are positive, we only have finitely many
partition elements, and it is clear that $\mathfrak{C}^{c,\Lambda}_\infty = \lim_{t\to\infty} \mathfrak{C}^{c,\Lambda}_t$ exists. The random variable
$\mathfrak{C}^{c,\Lambda}_\infty$ has partition elements that are all located at the cemetery state, and hence this holds
uniformly for all coalescents starting in $n \in \mathbb{N}$ partition elements.
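Let

(4.5)  $p_{n,k} = \mathbb{P}\{|C_\infty^{c,\Lambda}| = k \mid C_0^{c,\Lambda} = \{\{1\}, \dots, \{n\}\}\}$.

Then, for all $H(\theta, (\{1\}, \dots, \{n\})) = \langle\theta, f\rangle^n$, $\theta \in \mathcal{P}(E)$ and $f \in C_b(E,\mathbb{R})$, we have

(4.6)  $H(\theta, C_\infty^{c,\Lambda}) = \sum_{k=1}^{n} p_{n,k}\, \langle\theta, f\rangle^k$.

This function is uniformly continuous in $\theta$ for any given parameter $n \in \mathbb{N}$.

The continuity property is now immediate, since the algebra generated by the monomials
forms a dense subset in $C_b(\mathcal{P}(E), \mathbb{R})$.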
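For illustration, the coefficients $p_{n,k}$ in (4.5) can be computed exactly in the simplest special case, which is an assumption made here and not the general setting of the paper: only pairs coalesce (Kingman-type dynamics) at a rate `pair_rate` per pair, and each block at the site jumps to the cemetery at rate `c`, after which it no longer coalesces. The function names and the parameter `pair_rate` are hypothetical. The sketch follows the embedded jump chain of the block-counting process.

```python
from fractions import Fraction
from functools import lru_cache


def cemetery_block_law(n, c, pair_rate):
    """Return [p_{n,1}, ..., p_{n,n}] for a pair-coalescent with killing.

    Assumption (illustration only): blocks at the site merge pairwise at
    rate `pair_rate` per pair, and each block moves to the cemetery at
    rate `c`; blocks at the cemetery are frozen.
    """
    c = Fraction(c)
    pair_rate = Fraction(pair_rate)

    @lru_cache(maxsize=None)
    def law(m):
        # law(m)[k] = P(k blocks end up at the cemetery | m blocks at the site)
        if m == 0:
            return (Fraction(1),) + (Fraction(0),) * n
        mig = m * c                          # total migration rate
        coal = pair_rate * m * (m - 1) / 2   # total coalescence rate
        total = mig + coal
        nxt = law(m - 1)                     # either event leaves m - 1 blocks at the site
        out = [Fraction(0)] * (n + 1)
        for k, p in enumerate(nxt):
            if p == 0:
                continue
            out[k] += (coal / total) * p          # merger: cemetery count unchanged
            if k + 1 <= n:
                out[k + 1] += (mig / total) * p   # migration: one more cemetery block
        return tuple(out)

    return list(law(n)[1:])


def duality_polynomial(p, x):
    """Evaluate sum_k p_{n,k} x^k as in (4.6), with x standing for <theta, f>."""
    return sum(pk * x ** (k + 1) for k, pk in enumerate(p))


if __name__ == "__main__":
    print(cemetery_block_law(3, 1, 1))
```

For $n = 2$ the sketch returns $p_{2,1} = r/(2c+r)$ and $p_{2,2} = 2c/(2c+r)$ with $r$ the pair rate, i.e., the probability that the two lineages coalesce before either of them migrates, respectively its complement.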



Remark 4.2. By Limic and Sturm [LS06, Theorem 12], under the dust-free condition (1.3),

(4.7)  $\lim_{n\to\infty} p_{n,k} = p_{\infty,k}$,  $\sum_{k\in\mathbb{N}} p_{\infty,k} = 1$. □

4.3 Structure of the McKean-Vlasov equilibrium 

In the case of the McKean-Vlasov Fleming-Viot processes, the equilibrium $\nu_\theta^{c,d,\delta_0}$ can be
identified as an atomic measure of the form

(4.8)  $\sum_{i\in\mathbb{N}} \Big[ W_i \prod_{j=1}^{i-1} (1 - W_j) \Big]\, \delta_{U_i}$

with $(U_i)_{i\in\mathbb{N}}$ i.i.d. $\theta$-distributed and $(W_i)_{i\in\mathbb{N}}$ i.i.d. BETA$(1, \frac{c}{d})$-distributed, independently of
each other (cf. [DGV95]). What can we say about the equilibrium $\nu_\theta^{c,d,\Lambda}$?

Proposition 4.3. [Representation of McKean-Vlasov equilibrium] Let $\nu_\theta^{c,d,\Lambda}$ be the
equilibrium of the process $Z_\theta^{c,d,\Lambda} = (Z_\theta^{c,d,\Lambda}(t))_{t\ge 0}$ with resampling constant $d$ and resampling
measure $\Lambda \in \mathcal{M}_f([0,1])$. Assume that $\Lambda$ is dust-free (recall (1.3)).

(a) The following decomposition holds:

(4.9)  $\nu_\theta^{c,d,\Lambda} = \mathcal{L}\Big[ \sum_{i\in\mathbb{N}} V_i\, \delta_{U_i} \Big]$.

Here, $(V_i)_{i\in\mathbb{N}}$ and $(U_i)_{i\in\mathbb{N}}$ are independent sequences of random variables taking values
in $[0,1]$, respectively, $E$. Moreover, $(U_i)_{i\in\mathbb{N}}$ is i.i.d. with distribution $\theta$,
$\sum_{i\in\mathbb{N}} V_i = 1$ a.s. and

(4.10)  $V_i = W_i \prod_{j=1}^{i-1} (1 - W_j)$,

where

(4.11)  $(W_j)_{j\in\mathbb{N}}$

is i.i.d. $[0,1]$-valued with some distribution $\rho$. This distribution is uniquely determined
by the moment measures of $\nu_\theta^{c,d,\Lambda}$ (which can be expressed in terms of the dual coalescent
process) and depends on $c$, $d$ and $\Lambda$.

(b) If $\theta \notin M = \{\delta_u\colon u \in E\}$ and $c, d > 0$, then

(4.12)  $0 \le \nu_\theta^{c,d,\Lambda}(M) < 1$.



Proof. 

(a) The distribution and the independence of $(U_i)_{i\in\mathbb{N}}$ follow from the representation of the state
at time $t \in [0,\infty]$ in terms of the entrance law of the $\Lambda$-coalescent starting from the partition
$(\{1\}, \{2\}, \dots)$. This representation is a consequence of the duality relation in (2.40) and de
Finetti's theorem, together with the dust-free condition on $\Lambda$ in (1.3), which guarantees the
existence of the frequencies of the partition elements at time $t$. Indeed, every state, including
the equilibrium state, can be written as the limit of the empirical distribution of the coalescent
entrance law starting from the partition $\{\{1\}, \{2\}, \dots\}$ at site 1, where we assign to each dual
individual the type of its partition element at time $\infty$, drawn independently from $\theta$, the
cemetery state. Here we use the fact that if we condition individuals not to coalesce with a
given individual, respectively, its subsequent partition element, then the process is again a
coalescent for the smaller (random) subpopulation without that individual, respectively, its
subsequent partition element.

The $(V_j)_{j\in\mathbb{N}}$ are the relative frequencies of the partition elements ordered according to their
smallest element. By construction, $(V_i)_{i\in\mathbb{N}}$ and $(U_i)_{i\in\mathbb{N}}$ are independent. The i.i.d. property
of $(W_j)_{j\in\mathbb{N}}$ follows from the property that, after we exclude the first $j$ partition elements
from the population, we are left with a $\Lambda$-coalescent entrance law starting from a countable
ordered population. More precisely, pick the atom corresponding to $U_1$ and call its weight
$W_1$. Let $W_2$ be the fraction of the remaining mass assigned to the atom corresponding to $U_2$,
etc. This defines the sequence $(W_j)_{j\in\mathbb{N}}$. From the representation of the state at time $t$ and of
the equilibrium state via the coalescent process starting from the partition $(\{1\}, \{2\}, \dots)$, we
conclude that $(W_j)_{j\in\mathbb{N}}$ is i.i.d. It remains to identify the law $\rho$ of $W$.

In principle, via the duality we can express the moments in equilibrium

(4.13)  $E_{\nu_\theta^{c,d,\Lambda}}\big[\langle X, f\rangle^n\big]$

in terms of $\langle\theta, f\rangle^k$, $k = 1, \dots, n$, and the coalescence probabilities before the migration jumps
into the cemetery state. The latter in turn can be calculated in terms of

(4.14)  $c, \quad d, \quad \int_{[0,1]} r^k (1-r)^{n-k}\, \Lambda(\mathrm{d}r)$.

These relations uniquely determine the statistics of the atom sizes, which in turn uniquely
determine the marginal distribution of the $W_j$'s via (4.10).



Remark 4.4. In the case where $\Lambda = \delta_0$ (the McKean-Vlasov Fleming-Viot process) it is
possible to identify the law of $W$ as the BETA$(1, \frac{c}{d})$-distribution. It remains an open problem
to identify the law of $W$ in general as a function of the ingredients in (4.14). This is a more
complex task because of the presence of the measure $\Lambda$.

(b) First consider the case $\Lambda = \delta_0$. Let us verify that, for $c > 0$ and $\theta \notin M$, there can be no
mass in $M$. Indeed, if there were an atom somewhere in $M$, then there would also be
an atom in $M$ after we merge types into a finite type set. However, in the latter situation
the $W_j$'s are BETA-distributed, hence do not have an atom at 0 or 1, and so also the law
of the $V_i$'s has no atom at 0 or 1. This immediately gives the claim, because it means that
$\nu_\theta^{c,d,\Lambda}(M) = 0$.

Next consider the case $\Lambda \neq \delta_0$. Then new types keep on coming in. We need to prove that
the event that $C_\infty^{c,\Lambda}$ contains more than one partition element has a positive probability.
But this is obviously true when $c, d > 0$. □
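The stick-breaking construction (4.10) is easy to illustrate numerically. The following sketch builds the first `n_atoms` weights $V_i$ from i.i.d. stick fractions $W_j$. The sampler for $W$ is passed in as a function because the law $\rho$ is unknown in general; the BETA$(1,\alpha)$ choice offered by the helper `beta_one` corresponds only to the Fleming-Viot case of Remark 4.4. All names are hypothetical.

```python
import random


def stick_breaking_weights(sample_w, n_atoms, rng):
    """Return (weights, remaining) with weights[i] = W_i * prod_{j<i} (1 - W_j).

    `sample_w(rng)` must return one draw of W in [0, 1]. `remaining` is the
    mass not yet assigned after n_atoms steps, so sum(weights) + remaining = 1.
    """
    weights = []
    remaining = 1.0
    for _ in range(n_atoms):
        w = sample_w(rng)
        weights.append(w * remaining)   # V_i as in (4.10)
        remaining *= 1.0 - w            # mass left for later atoms
    return weights, remaining


def beta_one(alpha):
    """Sampler for BETA(1, alpha); an assumption valid only for Fleming-Viot."""
    return lambda rng: rng.betavariate(1.0, alpha)


if __name__ == "__main__":
    rng = random.Random(0)
    v, rest = stick_breaking_weights(beta_one(2.0), 50, rng)
    print(sum(v), rest)
```

Since the unassigned mass after $i$ steps is $\prod_{j \le i}(1-W_j)$, the identity $\sum_i V_i = 1$ a.s. holds as soon as this product tends to zero.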

4.4 First and second moment measure 

We can identify the first and second moments of the equilibrium explicitly, and we can use
the outcome to calculate the variance of $M^{(j)}_{-k}$ for $k = 0, \dots, j$ (the interaction chain defined
in Section 1.5.5). Recall the definition of $E_{\nu_\theta^{c,d,\Lambda}}[\mathrm{Var}_x(\psi)]$ from (1.69).



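Proposition 4.5. [Variance] For every $\psi \in C_b(E)$,

(4.15)  $E_{\nu_\theta^{c,d,\Lambda}}[\mathrm{Var}_x(\psi)] = \int_{\mathcal{P}(E)} \nu_\theta^{c,d,\Lambda}(\mathrm{d}x)\, \big( \langle\psi^2, x\rangle - \langle\psi, x\rangle^2 \big) = \frac{2c}{2c + \lambda + 2d}\, \mathrm{Var}_\theta(\psi)$.

Proof. We calculate the expectation of $\langle\varphi, x\rangle$, $\varphi \in C_b(E)$, and $\langle\varphi, x^{\otimes 2}\rangle$, $\varphi \in C_b(E^2)$, in
equilibrium.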

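It follows from (4.1) with $\nu = \nu_\theta^{c,d,\Lambda}$ that

(4.16)  $n = 1$, $\varphi \in C_b(E)$:  $0 = c \int_{\mathcal{P}(E)} \nu(\mathrm{d}x)\, \langle\varphi, (\theta - x)\rangle$,

i.e., $\int_{\mathcal{P}(E)} \nu(\mathrm{d}x)\, \langle\varphi, x\rangle = \langle\varphi, \theta\rangle$. It further follows that

(4.17)  $n = 2$, $\varphi \in C_b(E^2)$:
  $0 = -2c \int_{\mathcal{P}(E)} \nu(\mathrm{d}x)\, \langle\varphi, x^{\otimes 2}\rangle$
    $+\, c \int_{\mathcal{P}(E)} \nu(\mathrm{d}x)\, \big[ \langle\varphi, \theta \otimes x\rangle + \langle\varphi, x \otimes \theta\rangle \big]$
    $+\, 2d \int_{\mathcal{P}(E)} \nu(\mathrm{d}x) \Big( \int_E x(\mathrm{d}a)\, \langle\varphi, \delta_a^{\otimes 2}\rangle - \langle\varphi, x^{\otimes 2}\rangle \Big)$
    $+\, \lambda \int_{\mathcal{P}(E)} \nu(\mathrm{d}x) \int_E x(\mathrm{d}a)\, \big\langle\varphi, (\delta_a - x)^{\otimes 2}\big\rangle$.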
J E 1 
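We can rewrite this relation as

(4.18)
  $\int_{\mathcal{P}(E)} \nu(\mathrm{d}x) \int_E x(\mathrm{d}a)\, \big\langle\varphi, (\delta_a - x)^{\otimes 2}\big\rangle$
    $= \int_{\mathcal{P}(E)} \nu(\mathrm{d}x) \Big( \int_E x(\mathrm{d}a)\, \langle\varphi, \delta_a^{\otimes 2}\rangle - \langle\varphi, x^{\otimes 2}\rangle \Big)$
    $= \frac{c}{\lambda + 2d} \Big( 2 \int_{\mathcal{P}(E)} \nu(\mathrm{d}x)\, \langle\varphi, x^{\otimes 2}\rangle - \int_{\mathcal{P}(E)} \nu(\mathrm{d}x)\, \big[ \langle\varphi, \theta \otimes x\rangle + \langle\varphi, x \otimes \theta\rangle \big] \Big)$.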



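From this, we see that

(4.19)
  $\int_{\mathcal{P}(E)} \nu(\mathrm{d}x)\, \langle\varphi, x^{\otimes 2}\rangle$
    $= \frac{\lambda + 2d}{2c + \lambda + 2d} \Big( \frac{c}{\lambda + 2d} \int_{\mathcal{P}(E)} \nu(\mathrm{d}x)\, \big[ \langle\varphi, \theta \otimes x\rangle + \langle\varphi, x \otimes \theta\rangle \big] + \int_{\mathcal{P}(E)} \nu(\mathrm{d}x) \int_E x(\mathrm{d}a)\, \langle\varphi, \delta_a^{\otimes 2}\rangle \Big)$
    $= \frac{\lambda + 2d}{2c + \lambda + 2d} \Big( \frac{2c}{\lambda + 2d}\, \langle\varphi, \theta^{\otimes 2}\rangle + \int_E \theta(\mathrm{d}a)\, \langle\varphi, \delta_a^{\otimes 2}\rangle \Big)$,

where we use (4.16) in the last line. Substituting this back into (4.18) and using (4.16) once
more, we get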



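(4.20)
  $\int_{\mathcal{P}(E)} \nu(\mathrm{d}x) \int_E \int_E Q_x(\mathrm{d}u, \mathrm{d}v)\, \varphi(u,v)$
    $= \int_{\mathcal{P}(E)} \nu(\mathrm{d}x) \Big( \int_E x(\mathrm{d}a)\, \langle\varphi, \delta_a^{\otimes 2}\rangle - \langle\varphi, x^{\otimes 2}\rangle \Big)$
    $= \frac{2c}{\lambda + 2d} \Big( \int_{\mathcal{P}(E)} \nu(\mathrm{d}x)\, \langle\varphi, x^{\otimes 2}\rangle - \langle\varphi, \theta^{\otimes 2}\rangle \Big)$
    $= \frac{2c}{2c + \lambda + 2d} \Big( \int_E \theta(\mathrm{d}a)\, \langle\varphi, \delta_a^{\otimes 2}\rangle - \langle\varphi, \theta^{\otimes 2}\rangle \Big)$
    $= \frac{2c}{2c + \lambda + 2d} \int_E \int_E Q_\theta(\mathrm{d}u, \mathrm{d}v)\, \varphi(u,v)$.

Pick $\varphi = \psi \times \psi$ in (4.20) to get the claim. □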
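As a sanity check of (4.15), consider the special case of two types and $\lambda = 0$, and assume (this is an assumption of the sketch, based on the stick-breaking representation (4.8) with BETA$(1, c/d)$ weights) that the equilibrium frequency $X$ of type 1 is BETA$(\alpha\theta_1, \alpha(1-\theta_1))$-distributed with $\alpha = c/d$. For $\psi$ the indicator of type 1 we have $\mathrm{Var}_x(\psi) = X(1-X)$, and exact Beta moments can be compared with the right-hand side of (4.15). The function names are hypothetical.

```python
from fractions import Fraction


def beta_expected_variance(c, d, theta1):
    """E[X(1 - X)] for X ~ BETA(alpha*theta1, alpha*(1 - theta1)), alpha = c/d.

    Uses E[X(1 - X)] = a*b / ((a + b)(a + b + 1)) for X ~ BETA(a, b).
    """
    theta1 = Fraction(theta1)
    alpha = Fraction(c) / Fraction(d)
    a = alpha * theta1
    b = alpha * (1 - theta1)
    return a * b / ((a + b) * (a + b + 1))


def formula_4_15(c, d, lam, theta1):
    """Right-hand side of (4.15) for psi = indicator of type 1."""
    theta1 = Fraction(theta1)
    var_theta = theta1 * (1 - theta1)
    return Fraction(2 * c) / (2 * c + lam + 2 * d) * var_theta


if __name__ == "__main__":
    print(beta_expected_variance(3, 2, Fraction(1, 4)), formula_4_15(3, 2, 0, Fraction(1, 4)))
```

Both expressions reduce to $\frac{c}{c+d}\,\theta_1(1-\theta_1)$, so in this special case the Beta moments and (4.15) agree.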



For $\lambda = \Lambda([0,1]) = 0$, (4.15) is the same as Dawson, Greven and Vaillancourt [DGV95,
Eq. (2.5)].

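Corollary 4.6. [Asymptotic variance of entrance law]

For $\psi \in C_b(E,\mathbb{R})$, the interaction chain satisfies

(4.21)  $\lim_{j\to\infty} E_{\mathcal{L}(M^{(j)}_0)}[\mathrm{Var}_x(\psi)] = 0$ (respectively, $> 0$),

if $\sum_{k\in\mathbb{N}_0} m_k = \infty$ (respectively, $\sum_{k\in\mathbb{N}_0} m_k < \infty$) with $m_k$ defined in (1.44).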
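Proof. From (4.15), we have the formula

(4.22)  $E_{\nu_\theta^{c,d,\Lambda}}[\mathrm{Var}_x(\psi)] = \frac{2c}{2c + \lambda + 2d}\, \mathrm{Var}_\theta(\psi)$.

Hence, we have the relation (recall (1.63) for the definition of $K_k(\theta, \mathrm{d}x)$)

(4.23)  $\int_{\mathcal{P}(E)} K_k(\theta, \mathrm{d}x)\, \mathrm{Var}_x(\psi) = \frac{2c_k}{2c_k + \lambda_k + 2d_k}\, \mathrm{Var}_\theta(\psi)$,

which says that in one step of the interaction chain the variance is modified by the factor

(4.24)  $n_k = \frac{2c_k}{2c_k + \lambda_k + 2d_k} = \frac{1}{1 + m_k}$.

Iteration gives

(4.25)  $E_{\mathcal{L}(M^{(j)}_0)}[\mathrm{Var}_x(\psi)] = \Big( \prod_{k=0}^{j} n_k \Big) \mathrm{Var}_\theta(\psi) = \Big( \prod_{k=0}^{j} \frac{1}{1 + m_k} \Big) \mathrm{Var}_\theta(\psi)$.

Therefore, taking logarithms, we see that (4.21) is equivalent to

(4.26)  $\sum_{k\in\mathbb{N}_0} m_k = \infty$ (respectively, $< \infty$). □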
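The dichotomy behind (4.26) is the elementary fact that, for $m_k \ge 0$, the product $\prod_k (1+m_k)^{-1}$ tends to zero if and only if $\sum_k m_k = \infty$. A minimal numerical sketch follows, with two hypothetical choices of $(m_k)$ that are not taken from the paper: $m_k = 1/(k+1)$ (divergent sum) and $m_k = 1/(k+1)^2$ (convergent sum).

```python
import math


def variance_factor(m):
    """Return prod_k 1 / (1 + m_k), computed via logarithms as in the proof."""
    return math.exp(-sum(math.log1p(mk) for mk in m))


if __name__ == "__main__":
    for j in (10, 100, 1000, 10000):
        divergent = variance_factor(1.0 / (k + 1) for k in range(j + 1))
        convergent = variance_factor(1.0 / (k + 1) ** 2 for k in range(j + 1))
        print(j, divergent, convergent)
```

In the first case the product telescopes to $1/(j+2)$ and vanishes as $j \to \infty$ (clustering regime), while in the second case it decreases to a strictly positive limit.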



We next prove a result that is similar to, but more involved than, [DGV95, Eq. (6.12)].
This result is necessary for the proof of Theorem 1.11 on diffusive clustering.

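Proposition 4.7. [Variance of the integral against a test function] For every $\psi \in C_b(E)$,
$j \in \mathbb{N}$ and $0 \le k \le j+1$,

(4.27)  $\mathrm{Var}_{\mathcal{L}(M^{(j)}_{-k})}\big(\langle x, \psi\rangle\big) = E_{\mathcal{L}(M^{(j)}_{-k})}\big[\langle x, \psi\rangle^2\big] - \langle\theta, \psi\rangle^2 = \Big( \sum_{i=k}^{j} \frac{d_{i+1}}{c_i} \prod_{l=i+1}^{j} \frac{1}{1 + m_l} \Big) \mathrm{Var}_\theta(\psi)$.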

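Proof. The proof uses the following two ingredients. Combining (4.15) and (4.24), we have

(4.28)  $E_{\nu_\theta^{c_k,d_k,\Lambda_k}}[\mathrm{Var}_x(\psi)] = \frac{1}{1 + m_k}\, \mathrm{Var}_\theta(\psi)$.

The first and the third line of (4.20) yield

(4.29)  $\mathrm{Var}_\nu\big(\langle x, \psi\rangle\big) = \frac{\lambda + 2d}{2c}\, E_\nu[\mathrm{Var}_x(\psi)]$.

Together with (4.15) and (1.42), we therefore obtain

(4.30)  $\mathrm{Var}_{\nu_\theta^{c_k,d_k,\Lambda_k}}\big(\langle x, \psi\rangle\big) = \frac{\lambda_k + 2d_k}{2c_k + \lambda_k + 2d_k}\, \mathrm{Var}_\theta(\psi) = \frac{d_{k+1}}{c_k}\, \mathrm{Var}_\theta(\psi)$.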
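Fix $j \in \mathbb{N}$. The proof follows by downward induction over $0 \le k \le j+1$. The initial case
$k = j+1$ is obvious because $M^{(j)}_{-(j+1)} = \theta$ by (1.62). Let us therefore assume that the claim
holds for $k+1$. By (1.62)-(1.63),

(4.31)  $E_{\mathcal{L}(M^{(j)}_{-k})}\big[\langle x, \psi\rangle^2\big] = \int_{\mathcal{P}(E)} \nu_\theta^{c_j,d_j,\Lambda_j}(\mathrm{d}\theta_j) \int_{\mathcal{P}(E)} \nu_{\theta_j}^{c_{j-1},d_{j-1},\Lambda_{j-1}}(\mathrm{d}\theta_{j-1}) \cdots \int_{\mathcal{P}(E)} \nu_{\theta_{k+1}}^{c_k,d_k,\Lambda_k}(\mathrm{d}\theta_k)\, \langle\theta_k, \psi\rangle^2$.

Next, use (4.30) to rewrite the inside integral as

(4.32)  $\int_{\mathcal{P}(E)} \nu_{\theta_{k+1}}^{c_k,d_k,\Lambda_k}(\mathrm{d}\theta_k)\, \langle\theta_k, \psi\rangle^2 = \langle\theta_{k+1}, \psi\rangle^2 + \frac{d_{k+1}}{c_k}\, \mathrm{Var}_{\theta_{k+1}}(\psi)$.

Substitute this back into (4.31), to obtain

(4.33)  $E_{\mathcal{L}(M^{(j)}_{-k})}\big[\langle x, \psi\rangle^2\big] = \int_{\mathcal{P}(E)} \nu_\theta^{c_j,d_j,\Lambda_j}(\mathrm{d}\theta_j) \cdots \int_{\mathcal{P}(E)} \nu_{\theta_{k+2}}^{c_{k+1},d_{k+1},\Lambda_{k+1}}(\mathrm{d}\theta_{k+1})\, \langle\theta_{k+1}, \psi\rangle^2$
    $+\, \frac{d_{k+1}}{c_k} \int_{\mathcal{P}(E)} \nu_\theta^{c_j,d_j,\Lambda_j}(\mathrm{d}\theta_j) \cdots \int_{\mathcal{P}(E)} \nu_{\theta_{k+2}}^{c_{k+1},d_{k+1},\Lambda_{k+1}}(\mathrm{d}\theta_{k+1})\, \mathrm{Var}_{\theta_{k+1}}(\psi)$.

The first term is given by the induction hypothesis. For the second term we use (4.28), to see
that the inside integral equals

(4.34)  $\int_{\mathcal{P}(E)} \nu_{\theta_{k+2}}^{c_{k+1},d_{k+1},\Lambda_{k+1}}(\mathrm{d}\theta_{k+1})\, \mathrm{Var}_{\theta_{k+1}}(\psi) = E_{\nu_{\theta_{k+2}}^{c_{k+1},d_{k+1},\Lambda_{k+1}}}\big[\mathrm{Var}_x(\psi)\big] = \frac{1}{1 + m_{k+1}}\, \mathrm{Var}_{\theta_{k+2}}(\psi)$.

Iteration of this reasoning for the second term in (4.33) leads to

(4.35)  $E_{\mathcal{L}(M^{(j)}_{-k})}\big[\langle x, \psi\rangle^2\big] = \Big( \sum_{i=k+1}^{j} \frac{d_{i+1}}{c_i} \prod_{l=i+1}^{j} \frac{1}{1 + m_l} \Big) \mathrm{Var}_\theta(\psi) + \langle\theta, \psi\rangle^2 + \frac{d_{k+1}}{c_k} \Big( \prod_{l=k+1}^{j} \frac{1}{1 + m_l} \Big) \mathrm{Var}_\theta(\psi)$,

which proves the claim. □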



If $\lambda_k = \Lambda_k([0,1]) = 0$, $k \in \mathbb{N}_0$, then (4.27) reduces to [DGV95, Eq. (6.12)]. Indeed, in that
case $d_{i+1} \prod_{l=i+1}^{j} \frac{1}{1 + m_l}$ is equal to $d_{j+1}$. (Note the typo in [DGV95, Eq. (6.12)]: $d_k$ should be
replaced by $d_{k+1}$.)

Remark 4.8. The results in this section can alternatively be inferred from the long-time
behaviour of the spatial $\Lambda$-coalescent with $G = \{0, *\}$.
5 Strategy of the proof of the main scaling theorem

The proof of Theorem 1.4 will be carried out in Sections 6-8. In this section we explain the
main line of the argument.

5.1 General scheme and three main steps 

In Dawson, Greven and Vaillancourt [DGV95], a general scheme was developed to derive
the scaling behaviour of space-time block averages for hierarchically interacting Fleming-Viot
processes, with the interaction coming from migration, i.e., a system similar to ours but without
$\Lambda$-Cannings block resampling (so for $\Lambda = \delta_0$, which results in diffusion processes rather than
jump processes). Nevertheless, this scheme is widely applicable and indicates what estimates
have to be established in a concrete model (with methods that may be specific to that model).

For our model, the difficulty sits in the fact that diffusions are replaced by jump processes,
even in the many-individuals-per-site limit. Below we explain how we can use the special
properties of the dual process derived in Section 2 to deal with this difficulty. In Sections 6-8
the various steps will be carried out in detail to prove our scaling result in Theorem 1.4. In
these sections we will mainly focus on the new features coming from the $\Lambda$-Cannings block
resampling. The refined multi-scale result in Theorem 1.8 will be proved in Section 9. This
can be largely based on earlier work ([DGV95, Section 4]), where the line of argument was
developed in detail for Fleming-Viot processes and needs no new ideas for Cannings processes:
it only requires carrying out a new moment calculation.

The analysis in Sections 6-8 proceeds in three main steps:

• Show that for the mean-field system, i.e., $G = G_{N,1} = \{0, 1, \dots, N-1\}$, in the limit as
$N \to \infty$ we obtain for single sites on time scale $t$ independent McKean-Vlasov processes,
and for block averages on time scale $Nt$ Fleming-Viot processes with a resampling
constant $d_1$ corresponding to $\Lambda_0$ and $c_0$. With an additional $\Lambda_1$-block resampling at rate
$N^{-2}$ there is no effect on time scale $t$, and so on time scale $Nt$ we obtain a $C^\Lambda$-process
with $\Lambda = d_1\delta_0 + \Lambda_1$. This is done in Section 6.

• Consider the $C_N^{\underline{c},\underline{\Lambda}}$-process restricted to $G_{N,K}$. More precisely, study its components and
its $k$-block averages for $1 \le k \le j \le K$ on time scales $N^j + tN^k$. This is done in
Section 7.

• Treat the $(j,k)$ renormalised systems for $1 \le k \le j \le K$, approximating the
$C_N^{\underline{c},\underline{\Lambda}}$-process on $\Omega_N$, in the limit as $N \to \infty$ and on time scales at most $N^K t$ for a fixed but
otherwise arbitrary $K \in \mathbb{N}$, by the process on $G_{N,K}$ from the previous step. This is done
in Section 8.

The three steps above are carried out following the scheme of proof developed in [DGV95].
What is new for jump processes? The key difference is that now semi-martingales (arising from
functionals of the process) are no longer controlled just by the compensator and the increasing
process of the linear functional $\langle X_t, f\rangle$, where $X = (X_t)_{t\ge 0}$ is the process in question and
$f \in C_b(E)$, as in the case of diffusions where linear and quadratic functions $\langle X_t, f\rangle$, $\langle X_t, f\rangle^2$
in $\mathcal{F}$ suffice to establish both tightness in path space and convergence of finite-dimensional
distributions. The new ingredients are the analysis of the linear operators of the martingale
problem acting on all of $\mathcal{F}$, and the extension of the tightness arguments to handle the jumps.
This relies heavily on the duality relation to the $\Lambda$-coalescent with block coalescence.
5.2 Embedding

In the proofs, we view the $C_N^{\underline{c},\underline{\Lambda}}$-process and the $C^\Lambda$-processes with $G = \{0, 1, \dots, N-1\}$,
$G = G_{N,K} = \{0, 1, \dots, N-1\}^K$ and $G = \Omega_N$, as embedded in a process for the choices $G = \mathbb{N}$,
$G = \mathbb{N}^K$ and $G = \Omega_\infty$, where

(5.1)  $\Omega_\infty = \bigcup_{M\in\mathbb{N}} \Omega_M \subset \mathbb{N}^{\mathbb{N}}$.

Note that $\Omega_\infty$ is countable, but that the $\Omega_N$'s are not subgroups of $\Omega_\infty$. The embedding
requires us to embed the test functions and the generators on $\Omega_N$ into those on $\Omega_\infty$. In the
calculations in Sections 6-8, we use this embedding without writing it out formally.

Proving weak convergence in path space for solutions of martingale problems with operators
acting only on functions of the current state reduces to showing convergence for generators,
combined with the compact containment condition for the path. Often, for processes
with values in compact state spaces, the latter follows from bounds on the generator as well.
More precisely, we have to choose a dense subset $\mathcal{A}$ of $C_b((\mathcal{P}(E))^{\mathbb{N}}, \mathbb{R})$ and show that the
compensator terms satisfy, for all $F \in \mathcal{A}$,

(5.2)  $\mathcal{L}\Big[ \Big( \int_0^t (L^{(N)} F)(X^{(N)}_s)\, \mathrm{d}s \Big)_{t\ge 0} \Big] \underset{N\to\infty}{\Longrightarrow} \mathcal{L}\Big[ \Big( \int_0^t (L F)(X_s)\, \mathrm{d}s \Big)_{t\ge 0} \Big]$.

We can then conclude that the limit points satisfy the desired martingale problem, after
which verification of the well-posedness of the latter gives convergence of the finite-dimensional
distributions. Once we can guarantee tightness in path space via the compensator convergence,
we are done.

The above procedure works well in the case of Fleming-Viot processes. In our case, since
we have jump processes, an additional argument is needed for the tightness in path space. This
argument will be based on Jakubowski's criterion, which reduces the tightness of measure-valued
processes to collections of real-valued processes (semi-martingales), whose tightness
in turn can be based on the characteristics of the semi-martingales. The latter requires
an additional argument to cope with the jumps, but is still essentially based on generator
calculations.

In summary, the role of Sections 6-8 is to carry out first some generator calculations and
then an asymptotic evaluation of the resulting expressions, leading to a limiting form. There
we will use an averaging principle for local variables, based on the local equilibrium dictated
by the slowly changing macroscopic variable.



6 The mean-field limit of $C^\Lambda$-processes

This section deals with the case $G = \{0, 1, \dots, N-1\}$ for a model that includes mean-field
migration and Cannings reproduction at rate 1 with resampling measure $\Lambda_0$ in single colonies.
We analyse the single components and the block averages on time scales $t$, $Nt$ and $Nt + u$
with $u \in \mathbb{R}$. The key results are formulated in Propositions 6.1 and 6.3 below. We will see
that we can also incorporate block resampling at rate $N^{-2}\Lambda_1$ and still get the same results.

The analysis for mean-field interacting Fleming-Viot processes with drift is given in detail
in [DGV95, Section 4]. The reader unfamiliar with the arguments involved is referred to
this paper (see, in particular, the outline of the abstract scheme in [DGV95, Section 4(b)(i),
pp. 2314-2315]). In what follows, we provide the main ideas again, and focus on the changes
arising from the replacement of the Fleming-Viot process by the $\Lambda$-Cannings resampling process,
i.e., the change from continuous to càdlàg semi-martingales.
We always start the process in an

(6.1)  i.i.d. random state with mean measure $E[X^{(N)}_0(0)] = \theta \in \mathcal{P}(E)$.

The system will be analysed in the limit as $N \to \infty$ in two steps: (1) component-wise on
time scale $t$ (Section 6.1); (2) block-wise on time scale $Nt$ and component-wise on time scale
$Nt + u$ with $u \in \mathbb{R}$ (Section 6.2).



6.1 Propagation of chaos: Single colonies and the McKean-Vlasov process 

In this section, we consider the $C^\Lambda$-mean-field model in Section 1.3.2 with $G = \{0, 1, \dots, N-1\}$.
We prove propagation of chaos for the collection $(\{X^{(N)}_0(t), \dots, X^{(N)}_{N-1}(t)\})_{t\ge 0}$ in the
limit as $N \to \infty$, i.e., we prove asymptotic independence of the components via duality and
component-wise convergence to the McKean-Vlasov process with parameters $d_0 = 0$, $c_0$, $\Lambda_0$, $\theta$.



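Proposition 6.1. [McKean-Vlasov limit, propagation of chaos]

Under assumption (6.1), for any $L \in \mathbb{N}$ fixed,

(6.2)  $\mathcal{L}\Big[ \big( X^{(N)}_0(t), \dots, X^{(N)}_L(t) \big)_{t\ge 0} \Big] \underset{N\to\infty}{\Longrightarrow} \prod_{\xi=0}^{L} \mathcal{L}\Big[ \big( Z_\theta^{c_0,d_0,\Lambda_0}(t) \big)_{t\ge 0} \Big]$

with $Z_\theta^{c_0,d_0,\Lambda_0}$ as in (1.19).

Corollary 6.2. [McKean-Vlasov limit with block resampling]

Consider the system above with an additional rate $N^{-2}\Lambda_1$ of block resampling. Then (6.2)
continues to hold.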



In order to prove (6.2), we will argue that the laws $\mathcal{L}[(\{X^{(N)}_\xi(t),\, \xi = 0, \dots, L\})_{t\ge 0}]$, $N \in \mathbb{N}$,
are tight by showing this first for components (Section 6.1.1) and characterise the weak limit
points. In order to carry this out, we verify asymptotic independence (Section 6.1.2), calculate
explicitly the action of the generator on the test functions in the martingale problem of $X^{(N)}$
(Section 6.1.3), and show, for functions depending on one component, uniform convergence to
the generator of the McKean-Vlasov operator with parameter $\theta = E[X^{(N)}_0(0)]$ (Section 6.1.4).



6.1.1 Tightness on path space in N 

We focus first on one component $(X^{(N)}_\xi(t))_{t\ge 0}$. We use Jakubowski's criterion for measure-valued
processes (see [D93, Theorem 3.6.4]). This requires us to show: (1) a compact containment
condition for the path, i.e., for all $\varepsilon, T > 0$ there exists a compact set $K_{T,\varepsilon}$ such that

(6.3)  $\mathbb{P}\big( \{ X^{(N)}_\xi(t) \in K_{T,\varepsilon} \text{ for all } t \in [0,T] \} \big) > 1 - \varepsilon$;

(2) tightness of certain evaluation processes $(F(X^{(N)}_\xi(t)))_{t\ge 0}$ in path space. In our case the
compact containment condition in (1) is immediate because we have a compact type space.
Condition (2) can be verified by using a criterion for tightness of Kurtz (see Dawson [D93,
Corollary 3.6.3]). Here, we use test functions as in (1.9) that only depend on the first $L$
coordinates. We further make use of the boundedness of the characteristics of the generator
as a function of $N$ when acting on a test function (recall (1.38)). Namely, we will see in
Section 6.1.3 that

(6.4)  $\sup_{N\in\mathbb{N}} \| L^{(N)} F \|_\infty < \infty$, for all $F \in \mathcal{F}$.

This allows us to verify that linear combinations of

(6.5)  $\big\{ \mathcal{L}\big[ (\langle X^{(N)}_\xi(t), f\rangle^n)_{t\ge 0} \big]\colon n \in \mathbb{N},\ f \in C_b(E,\mathbb{R}) \big\}$

are tight, which verifies the tightness criterion in (2).

In order to verify (6.5), we use a criterion for the tightness of semi-martingales based on
the local characterisation via the Joffe-Métivier criterion ([JM86], see also [D93, Theorem
3.6.6]).

6.1.2 Asymptotic independence 

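In this section, we use duality to prove the factorisation of spatial mixed moments (including
the case with block coalescence at rate $N^{-2}\Lambda_1$):

(6.6)  $\limsup_{N\to\infty} \Big| E\Big[ \prod_{\xi=0}^{L} \langle X^{(N)}_\xi(t), f_\xi\rangle \Big] - \prod_{\xi=0}^{L} E\Big[ \langle X^{(N)}_\xi(t), f_\xi\rangle \Big] \Big| = 0$, for all $t > 0$.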



A similar result holds for mixed moments over different time points.

Proof of (6.6). Obviously, no block coalescence takes place in the time interval $[0,T]$ in the
limit as $N \to \infty$. We verify the remaining claim by showing that any two partition elements
of the dual process never meet, so that for $n$ partition elements none of the possible pairs will
ever meet. Indeed, the time for two random walks to meet is the waiting time for the
rate-$2c_0$ random walk to hit 2 starting from 1. This waiting time is the sum of a geometrically
distributed number of jumps with parameter $N^{-1}$, each occurring after an $\exp(2c_0)$-distributed
waiting time. By explicit calculation, the probability for this event to occur before time $t$ is
$O(N^{-1})$, which gives the claim. □
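The explicit calculation can be sketched as follows, under the modelling assumption taken from the description above that each jump of the rate-$2c_0$ walk is a meeting with probability exactly $N^{-1}$, independently of everything else. A geometric sum of i.i.d. exponential waiting times is again exponential, here with rate $2c_0/N$, so the meeting probability before time $t$ is $1 - e^{-2c_0 t/N} \le 2c_0 t/N$. The function name is hypothetical.

```python
import math


def meeting_probability(c0, t, n_sites):
    """P(two dual lineages meet before time t) under the geometric-sum model.

    Assumption: meeting time ~ Exp(2*c0/n_sites), i.e. a Geometric(1/n_sites)
    number of Exp(2*c0) waiting times.
    """
    return -math.expm1(-2.0 * c0 * t / n_sites)


if __name__ == "__main__":
    for n_sites in (10, 100, 1000, 10000):
        p = meeting_probability(1.0, 5.0, n_sites)
        print(n_sites, p, n_sites * p)
```

The last column stays bounded by $2c_0 t$, which is the $O(N^{-1})$ bound used in the proof. For $n$ partition elements, a union bound over the $\binom{n}{2}$ pairs gives the same order.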

6.1.3 Generator convergence 

In order to show the convergence of $L^{(N)}F$, we investigate the migration and the resampling
part separately.



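• Migration part. Recall from (1.12) that the migration operator for the geographic space
$G = G_{N,1} = \{0, 1, \dots, N-1\}$ is

(6.7)  $(L^{(N)}_{\mathrm{mig}} F)(x) = \frac{c_0}{N} \sum_{\xi,\zeta\in G_{N,1}} \int_E (x_\zeta - x_\xi)(\mathrm{d}a)\, \frac{\partial F(x)}{\partial x_\xi}[\delta_a]$,

where $F \in \mathcal{F} \subset C_b(\mathcal{P}(E)^N, \mathbb{R})$, with $\mathcal{F}$ the algebra of functions of the form (1.9). We rewrite
(6.7) as

(6.8)  $(L^{(N)}_{\mathrm{mig}} F)(x) = c_0 \sum_{\xi\in G_{N,1}} \int_E (y - x_\xi)(\mathrm{d}a)\, \frac{\partial F(x)}{\partial x_\xi}[\delta_a]$,

where $y = N^{-1} \sum_{\zeta=0}^{N-1} x_\zeta = N^{-1} \sum_{\zeta\in G_{N,1}} x_\zeta$ denotes the block average. We will show that, in
the limit $N \to \infty$, $(L^{(N)}_{\mathrm{mig}} F)(x)$ only depends on the mean type measure $\theta$ of the initial state,
i.e., it converges to

(6.9)  $(L^{c_0}_\theta F)(x) = c_0 \sum_{\xi\in\mathbb{N}} \int_E (\theta - x_\xi)(\mathrm{d}a)\, \frac{\partial F(x)}{\partial x_\xi}[\delta_a]$,

where we use for this generator acting on $C_b((\mathcal{P}(E))^{\mathbb{N}}, \mathbb{R})$ the same notation we used for the
McKean-Vlasov process with immigration-emigration on $\mathcal{P}(E)$. Furthermore, we show that

(6.10)  $\theta \mapsto L^{c_0}_\theta F \in C_b(\mathcal{P}(E), \mathbb{R})$ is continuous for all $\theta \in \mathcal{P}(E)$.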
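To show the convergence, define

(6.11)  $B_\theta = \Big\{ x \in (\mathcal{P}(E))^{\mathbb{N}}\colon N^{-1} \sum_{\xi=0}^{N-1} x_\xi \underset{N\to\infty}{\longrightarrow} \theta \Big\} \subset (\mathcal{P}(E))^{\mathbb{N}}$,

and

(6.12)  $B = \bigcup_{\theta\in\mathcal{P}(E)} B_\theta$.

If we have an i.i.d. initial law (respectively, an exchangeable law) with mean measure $\theta$, then
the process $X^{(N)}$ satisfies

(6.13)  $\mathcal{L}[X^{(N)}(t)](B) = 1$ (respectively, $\mathcal{L}[X^{(N)}(t)](B_\theta) = 1$).

Indeed, as we will see in Section 6.2, $Y^{(N)}$ (recall (1.40)) evolves on time scale $Nt$. More
precisely, $(Y^{(N)}(tN))_{t\ge 0}$ is tight in path space and therefore converges over a finite time
horizon to the mean type measure $\theta$ of the initial state. In a formula (the right-hand side
means a constant path):

(6.14)  $\mathcal{L}\big[ (Y^{(N)}(t))_{t\in[0,T]} \big] \underset{N\to\infty}{\Longrightarrow} \mathcal{L}\big[ (\theta)_{t\in[0,T]} \big]$.

Therefore, on $B_\theta$ we have

(6.15)  $\big| (L^{(N)}_{\mathrm{mig}} F)(x) - (L^{c_0}_\theta F)(x) \big| \underset{N\to\infty}{\longrightarrow} 0$, for all $x \in B_\theta$.

Hence, on the path space, we have

(6.16)  $\mathcal{L}\Big[ \Big( \int_0^t (L^{(N)}_{\mathrm{mig}} F)(X^{(N)}(s))\, \mathrm{d}s - \int_0^t (L^{c_0}_\theta F)(X^{(N)}(s))\, \mathrm{d}s \Big)_{t\ge 0} \Big] \underset{N\to\infty}{\Longrightarrow} \delta_0$.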

• Resampling part. The action of the resampling term on each component (recall (1.14))
does not depend on $N$ and hence we obtain, by the law of large numbers for the marking
operation (recall that $F$ depends on finitely many coordinates only),

(6.17)  $\big| (L^{(N)}_{\mathrm{res}} F)(x) - (L^{\Lambda_0} F)(x) \big| \underset{N\to\infty}{\longrightarrow} 0$, for all $x \in (\mathcal{P}(E))^{\mathbb{N}}$,

where

(6.18)  $(L^{\Lambda_0} F)(x) = \sum_{\xi\in\mathbb{N}} \int_{[0,1]} \Lambda_0^*(\mathrm{d}r) \int_E x_\xi(\mathrm{d}a)\, \big[ F(x_0, \dots, x_{\xi-1}, (1-r)x_\xi + r\delta_a, x_{\xi+1}, \dots, x_{N-1}) - F(x) \big]$.
6.1.4 Convergence to the McKean-Vlasov process

In what follows, we fix $\xi \in \mathbb{N}_0$ and let

(6.19)  $G(x_\xi) = \int_{E^n} x_\xi^{\otimes n}(\mathrm{d}u)\, \varphi(u)$,  $n \in \mathbb{N}$, $\varphi \in C_b(E^n, \mathbb{R})$.

We know that $(X^{(N)}_\xi(t))_{\xi\in\mathbb{N}}$ is tight and that all weak limit points are systems of independent
random processes ("propagation of chaos"). It remains to identify the unique marginal law.

Let the initial condition $(X_\xi(0))_{\xi\in\mathbb{N}}$ be i.i.d. $\mathcal{P}(E)$-valued random variables with intensity
measure $\theta$. Then each single component converges and the limiting coordinate process
has generator

(6.20)  $(L^{c_0,d_0,\Lambda_0}_\theta G)(x_\xi) = c_0 \int_E (\theta - x_\xi)(\mathrm{d}a)\, \frac{\partial G(x_\xi)}{\partial x_\xi}[\delta_a] + \int_{[0,1]} \Lambda_0^*(\mathrm{d}r) \int_E x_\xi(\mathrm{d}a)\, \big[ G((1-r)x_\xi + r\delta_a) - G(x_\xi) \big]$,

where $\theta \in \mathcal{P}(E)$ is the initial mean measure. Indeed, we may now reason as in [D93, the
second part of Section 2.9]. Tightness of the processes $(X^{(N)}_\xi(t))_{t\ge 0}$ was shown in Section 6.1.1.
Fix $\xi \in \mathbb{N}_0$ and consider a convergent subsequence $(X^{(N_k)}_\xi(t))_{t\ge 0}$, $k \in \mathbb{N}$. We claim that
the limiting process is the unique solution to the well-posed martingale problem with
corresponding generator $L^{c_0,d_0,\Lambda_0}_\theta$ and initial condition $\theta$. Recall from Section 6.1.3 that, for
all test functions $F \in \mathcal{F}$,

(6.21)  $\mathcal{L}\Big[ \Big( \int_0^t \big( L^{(N)}_{\mathrm{mig}} + L^{(N)}_{\mathrm{res}} \big)(F)(X^{(N)}(s))\, \mathrm{d}s \Big)_{t\ge 0} \Big] \underset{N\to\infty}{\Longrightarrow} \mathcal{L}\Big[ \Big( \int_0^t (L^{c_0,d_0,\Lambda_0}_\theta F)(X^{(\infty)}(s))\, \mathrm{d}s \Big)_{t\ge 0} \Big]$

in the sense of processes. Hence all weak limit points of $X^{(N)}$ solve the $L^{c_0,d_0,\Lambda_0}_\theta$-martingale
problem of Section 1.3.3. The right-hand side is the compensator of a well-posed martingale
problem, and hence we have convergence.

6.2 The mean-field finite-system scheme 

In this section we verify the mean-field "finite system scheme" for the $C^\Lambda$-process, i.e.,
we consider $L+1$ tagged sites $\{X^{(N)}_0(t), \dots, X^{(N)}_L(t)\}$ and the block average
$Y^{(N)}(t) = N^{-1} \sum_{\xi\in G_{N,1}} X^{(N)}_\xi(t)$ and we prove:

• convergence of $(Y^{(N)}(Nt))_{t\ge 0}$ to the Fleming-Viot diffusion $Y(t) = Z^{0,d_1,\delta_0}_\theta(t)$ with
parameter $d_1 = \frac{c_0\lambda_0}{2c_0 + \lambda_0}$ and initial state $\theta$;

• convergence of the components $(\{X^{(N)}_\xi(Nt + u),\, \xi = 0, \dots, L\})_{u\ge 0}$ to the equilibrium
McKean-Vlasov process with immigration-emigration $(Z^{c_0,d_0,\Lambda_0}_{\theta(t)}(u))_{u\ge 0}$ starting from
distribution $\nu^{c_0,d_0,\Lambda_0}_{\theta(t)}$ with $\theta(t) = Y(t)$ (recall that $d_0 = 0$).

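Proposition 6.3. [Mean-field finite system scheme]

For initial laws with i.i.d. initial configuration and mean measure $\theta$,

(6.22)  $\mathcal{L}\big[ (Y^{(N)}(Nt))_{t\ge 0} \big] \underset{N\to\infty}{\Longrightarrow} \mathcal{L}\big[ (Z^{0,d_1,\delta_0}_\theta(t))_{t\ge 0} \big]$

with $d_1 = \frac{c_0\lambda_0}{2c_0 + \lambda_0}$. Moreover, for every $u \in \mathbb{R}$ and $L \in \mathbb{N}$,

(6.23)  $\mathcal{L}\big[ (X^{(N)}_\xi(Nt + u))_{\xi=0,\dots,L} \big] \underset{N\to\infty}{\Longrightarrow} \int_{\mathcal{P}(E)} P_t(\mathrm{d}\theta')\, \big( \nu^{c_0,d_0,\Lambda_0}_{\theta'} \big)^{\otimes(L+1)}$

with $P_t = \mathcal{L}\big[ Z^{0,d_1,\delta_0}_\theta(t) \big]$.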

Corollary 6.4. [Mean-field finite system scheme with $\Lambda_1$-block resampling]

Consider the model above with additional block resampling at rate $N^{-2}\Lambda_1$. Then, in the
right-hand side of (6.22), $Z^{0,d_1,\delta_0}_\theta$ must be replaced by $Z^{0,d_1,\Lambda_1}_\theta$, and similarly in the definition
of $P_t$ in (6.23).

The proof of the mean-field finite system scheme follows the abstract argument developed
in [DGV95]. Namely, we first establish tightness of the sequence of processes $(Y^{(N)}(Nt))_{t\ge 0}$,
$N \in \mathbb{N}$, which can be done as in Section 6.1.1 for $(X^{(N)}_0(t), \dots, X^{(N)}_L(t))_{t\ge 0}$, $N \in \mathbb{N}$, once we
have calculated the generators. A representation for the generator of the process is found in
Sections 6.2.1-6.2.2 below. This can be pursued, with the help of the idea of local equilibria
based on the ergodic theorems of Section 4, to obtain first (6.23) and then (6.22).

In Sections 6.2.1-6.2.2, we calculate the action of the generator of the martingale problem
on the test functions induced by the functions necessary to arrive at the action of the generator
of the limiting process. In Section 6.2.4, we pass to the limit $N \to \infty$, where as in Section 6.1
we have to use an averaging principle. However, instead of a simple law of large numbers,
this now is a dynamical averaging principle with local equilibria for the single components
necessary to obtain the expression for the limiting block-average process.
By the definition of the generator of a process, $M^{x,F} = (M^{x,F}_t)_{t\ge 0}$ with

(6.24)  $M^{x,F}_t = F(x_t) - F(x_0) - \int_0^t \mathrm{d}s\, \big[ L^{(N)}_{\mathrm{mig}} F + L^{(N)}_{\mathrm{res}} F \big](x_s)$

is a martingale for all $F$, as in (6.19). The same holds with $x$ replaced by the block averages
$y$ (by the definition of $y$). Once again, we will investigate the migration and the resampling
operator separately, this time for the block average.



6.2.1 Migration 



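In this section, we consider functions $F \circ y$ with $F$ as in (6.19) and $y$ a block average (which
is equivalent to choosing $F$ as in (1.9) with $N = 1$). We will show below that $L^{(N)}_{\mathrm{mig}}(F \circ y) = 0$,
so that migration has no effect.

Recall $(L^{(N)}_{\mathrm{mig}} F)(x)$ as rewritten in (6.8). For the block averages $y$ the migration operator
can be calculated as follows. Since $y = y(x)$ and $F(y) = (F \circ y)(x)$ can be seen as functions
of $x$ in the algebra $\mathcal{F}$ of functions in $x$ of the form (6.19), we have

(6.25)  $(L^{(N)}_{\mathrm{mig}} F)(y) = \big( L^{(N)}_{\mathrm{mig}} (F \circ y) \big)(x) = \sum_{\xi\in G_{N,1}} c_0 \int_E (y - x_\xi)(\mathrm{d}a)\, \frac{\partial (F \circ y)(x)}{\partial x_\xi}[\delta_a]$.

For $y = N^{-1} \sum_{\xi\in G_{N,1}} x_\xi$ this yields

(6.26)  $\frac{\partial (F \circ y)(x)}{\partial x_\xi}[\delta_a] = \frac{\partial F(y)}{\partial y}\Big[ \frac{\delta_a}{N} \Big]$

and hence

(6.27)  $(L^{(N)}_{\mathrm{mig}} F)(y) = \sum_{\xi\in G_{N,1}} c_0 \int_E (y - x_\xi)(\mathrm{d}a)\, \frac{\partial F(y)}{\partial y}\Big[ \frac{\delta_a}{N} \Big] = 0$.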
6.2.2 From $\Lambda$-Cannings to Fleming-Viot

Next, we evaluate the moment measures of the average in the limit as $N \to \infty$ and show
convergence of the terms to the Fleming- Viot second order term. We denote by L^?' 1 ^ the 
generator Lres on time scale Nt. 

Lemma 6.5. [Generator convergence: resampling] 

On time scale Nt, in the limit as N — >■ oo, 



(6 ' 28) =^ E / 01 W / ^N^l^f + Ur(-i{ + «.)] + 0(iT 1 ). 



Proof of Lemma 6.5. We first rewrite F(yt) in terms of a^: 
F(»l) = <V,!/f"> = (v, ' ' ~~ 



(6.29) 



iV 
1 



£lGGjV,l £nSGjV,l 

n 



E I (V.^i(*)®-"®^„(*))- 



Abbreviate 

(6.30) F (?1 -" e ' l) (x) 



du W 



\ i=i / 



Note that, in this notation, $\xi_i = \xi_j$ for $i \neq j$ is possible. Recall that $(x_t)_{t\ge 0}$ has generator
$L^{(N)}$ and is the unique solution of the martingale problem (6.24). If we use (6.29) in (6.24) with $x$
replaced by $y$, then we obtain that $(y_t)_{t\ge 0}$ solves the martingale problem with generator



(6.31) (L^F)(y) = ± 



E Im^ 1 -^ 



for the resampling part. Together with (1.14) this yields the expression 
(6.32) 

(4? s ) F)(y) = 4n\<g> E ) E / AJ(dr) / a; € (da) 

x F^'-'^\x , (1 - r)x 5 + r5 a , x t+1 , x N -i) - F^'-^ n \x) 

We must analyse this expression in the limit as $N \to \infty$. To do so, we collect the leading order
terms. The key quantity is the cardinality of the set $\{\xi_1, \dots, \xi_n\}$, for which we distinguish
three cases.
Case 1: $|\{\xi_1, \dots, \xi_n\}| = n$, i.e., all $\xi_i$, $1 \le i \le n$, are distinct.



The contribution to (6.32) is zero. For $\xi \notin \{\xi_1, \dots, \xi_n\}$ this is obvious by the definition of
$F^{(\xi_1,\dots,\xi_n)}(x)$ in (6.30). Otherwise, we have
x 5 (da) \f^'-'^\xq, (1 - r)x 5 + r<5 a , x^+i, . . . , x N -i) - F&>~'^( 



x^(da) 



(6.33) 



tpjXfr®--® ((1 - r)x ? + rS a ) ®---®x^ n )-{ip,x^®---® x^J 



only change (unique) 
position with £i=£ 



0, 



where in the last line we use that {xg, 1) = 1. 
Case 2: $|\{\xi_1,\ldots,\xi_n\}|\le n-2$.

The contribution to (6.32) is of order $N^{-2}$. Indeed, the contribution is bounded from above by

(6.34) $$\frac{1}{N^n}\Big(\bigotimes_{i=1}^n\sum_{\xi_i\in G_{N,1}}\Big)1_{\{|\{\xi_1,\ldots,\xi_n\}|\le n-2\}}\,\lambda_0\,C_F = N^{-2}\lambda_0C_F,$$

where $C_F$ denotes a generic constant that depends on $F$ only (as in (6.19)), and thereby on $\varphi$ and $n$. Here we use (1.38) and the fact that the sum $\sum_{\xi\in G_{N,1}}$ yields at most $n$ non-zero summands by the definition of $F^{(\xi_1,\ldots,\xi_n)}(x)$ in (6.30).
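Case 3: $|\{\xi_1,\ldots,\xi_n\}| = n-1$.

There exist $1\le m_1<m_2\le n$ such that $\xi_{m_1}=\xi_{m_2}$ while all other $\xi_i$, $1\le i\le n$, are different. By the reasoning as in (6.33), we see that the only non-zero contribution of the sum $\sum_{\xi\in G_{N,1}}$ to the generator in (6.32) comes from the case where $\xi=\xi_{m_1}=\xi_{m_2}$. We therefore obtain

(6.35) $$\begin{aligned}(L^{(N)}_{\mathrm{res}}F)(y) = {}&\frac{1}{N^n}\Big(\bigotimes_{i=1}^n\sum_{\xi_i\in G_{N,1}}\Big)1_{\{|\{\xi_1,\ldots,\xi_n\}|=n-1\}}\sum_{1\le m_1<m_2\le n}1_{\{\xi_{m_1}=\xi_{m_2}\}}\int_{[0,1]}\Lambda^*_0(dr)\int_E x_{\xi_{m_1}}(da)\\ &\times\Big[F^{(\xi_1,\ldots,\xi_n)}\big(x_0,\ldots,(1-r)x_{\xi_{m_1}}+r\delta_a,x_{\xi_{m_1}+1},\ldots,x_{N-1}\big)-F^{(\xi_1,\ldots,\xi_n)}(x)\Big]+O(N^{-2}).\end{aligned}$$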



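Reasoning similarly to (6.34), we see that extending

(6.36) $$\Big(\bigotimes_{i=1}^n\sum_{\xi_i\in G_{N,1}}\Big)1_{\{|\{\xi_1,\ldots,\xi_n\}|=n-1\}}\sum_{1\le m_1<m_2\le n}1_{\{\xi_{m_1}=\xi_{m_2}\}}$$

in (6.35) to

(6.37) $$\sum_{1\le m_1<m_2\le n}\ \sum_{\xi_{m_1}\in G_{N,1}}\Big(\bigotimes_{i\in\{1,\ldots,n\}\setminus\{m_1,m_2\}}\sum_{\xi_i\in G_{N,1}}\Big)$$

only produces an additional error of order $N^{-2}$. Using this observation in (6.35), we get

(6.38) $$\begin{aligned}(L^{(N)}_{\mathrm{res}}F)(y) = {}&\frac{1}{N^2}\sum_{1\le m_1<m_2\le n}\ \sum_{\xi\in G_{N,1}}\int_{[0,1]}\Lambda^*_0(dr)\int_E x_\xi(da)\\ &\times\Big[\big\langle\varphi,y\otimes\cdots\otimes\big((1-r)x_\xi+r\delta_a\big)\otimes\cdots\otimes\big((1-r)x_\xi+r\delta_a\big)\otimes\cdots\otimes y\big\rangle\\ &\qquad-\big\langle\varphi,y\otimes\cdots\otimes x_\xi\otimes\cdots\otimes x_\xi\otimes\cdots\otimes y\big\rangle\Big]+O(N^{-2}),\end{aligned}$$

where in both brackets only the positions $m_1$ and $m_2$ are changed.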



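Now use that

(6.39) $$\int_E x_\xi(da)\,\big\langle\varphi,y\otimes\cdots\otimes x_\xi\otimes\cdots\otimes(-rx_\xi+r\delta_a)\otimes\cdots\otimes y\big\rangle = 0$$

(only the positions $m_1$ and $m_2$ are changed, for $m_1,m_2$ fixed) to obtain from (6.38) that

(6.40) $$\begin{aligned}(L^{(N)}_{\mathrm{res}}F)(y) &= \frac{1}{N^2}\sum_{1\le m_1<m_2\le n}\ \sum_{\xi\in G_{N,1}}\int_{[0,1]}\Lambda^*_0(dr)\int_E x_\xi(da)\,\big\langle\varphi,y\otimes\cdots\otimes r(-x_\xi+\delta_a)\otimes\cdots\otimes r(-x_\xi+\delta_a)\otimes\cdots\otimes y\big\rangle+O(N^{-2})\\ &= \frac{1}{N^2}\sum_{\xi\in G_{N,1}}\int_{[0,1]}\Lambda^*_0(dr)\int_E x_\xi(da)\,\frac12\,\frac{\partial^2F(y)}{\partial y\,\partial y}\big[r(-x_\xi+\delta_a),\,r(-x_\xi+\delta_a)\big]+O(N^{-2}).\end{aligned}$$

Comparing Cases (1)-(3), we see that only the latter contributes to the leading term. Changing to time scale $Nt$ in (6.40), i.e., multiplying $L^{(N)}_{\mathrm{res}}$ by $N$, we complete the proof.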



□ 
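The case distinction in the proof rests on counting index tuples $(\xi_1,\ldots,\xi_n)\in G_{N,1}^n$ by their number of distinct entries. The following is a minimal brute-force sketch of that count (the function name is hypothetical and only illustrates the scaling): the fraction of tuples with exactly $n-1$ distinct entries is of order $N^{-1}$, while the fraction with at most $n-2$ distinct entries is of order $N^{-2}$.

```python
from itertools import product


def distinct_count_fractions(N, n):
    """Fraction of tuples in {0,...,N-1}^n having exactly k distinct entries."""
    counts = {}
    for xi in product(range(N), repeat=n):
        k = len(set(xi))
        counts[k] = counts.get(k, 0) + 1
    total = N ** n
    return {k: c / total for k, c in counts.items()}


if __name__ == "__main__":
    n = 3
    for N in (5, 10, 20):
        frac = distinct_count_fractions(N, n)
        case3 = frac.get(n - 1, 0.0)  # exactly n-1 distinct indices
        case2 = sum(v for k, v in frac.items() if k <= n - 2)
        # Rescaled fractions: both stay bounded as N grows.
        print(N, case3 * N, case2 * N ** 2)
```

The rescaled quantities printed in the loop remain bounded in $N$, which is the combinatorial content of (6.34) and of the prefactor $N^{-2}\sum_{\xi}$ in (6.40).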



6.2.3 A comment on coupling and duality 

The techniques of coupling and duality are of major importance. One application can be found in [DGV95, Section 4], namely, to prove Equation (4.17) therein. The key point is to obtain control on the difference between $\mathcal L[Z_t]$ and $\mathcal L[Z'_t]$ for two Markov processes with identical dynamics but different initial states. Such estimates can be derived via coupling of the two dynamics, or alternatively, via dual processes that are based on finite particle systems with non-increasing particle numbers, allowing for an entrance law starting from a countably infinite number of particles. Both these properties hold in our model. This fact is used to argue that the configuration locally converges on time scale $Nt$ to an equilibrium by the following restart argument.

At times $Nt$ and $Nt-t_N$, with $\lim_{N\to\infty}t_N=\infty$ and $\lim_{N\to\infty}t_N/N=0$, the empirical mean remains constant. Hence, we can argue that, in the limit as $N\to\infty$, a system started at time $Nt-t_N$ converges over time $t_N$ to the equilibrium dictated by the current mean. Two facts are needed to make this rigorous: (1) the map $\theta\mapsto\nu_\theta^{c,d,\Lambda}$ must be continuous; (2) the ergodic theorem must hold uniformly in the initial state. Both coupling and duality do the job. This is why we can work with both in [DGV95].

6.2.4 McKean-Vlasov process of the 1-block averages on time scale Nt 



Recall the definition of the Fleming-Viot diffusion operator $Q$ in (1.18) and the equilibrium $\nu_\theta^{c,d,\Lambda}$ of the McKean-Vlasov process in the line preceding (4.1). In what follows we continue to denote by $L^{(N)[1]}$ the generator $L^{(N)}$ on time scale $Nt$. Observe that the compensators of the martingales in (6.24) are functionals of the empirical measure of the configuration. The set of configurations on which the law of the configuration concentrates in the limit as $N\to\infty$ turns out to be

(6.41) $$B^*_\theta = \Big\{x\in(\mathcal P(E))^{\mathbb N}\colon\ \frac1N\sum_{i=1}^N\delta_{x_i}\ \underset{N\to\infty}{\Longrightarrow}\ \nu_\theta^{c_0,0,\Lambda_0}\Big\},$$

where $\theta$ is called the intensity of the configuration and

(6.42) $$B^* = \bigcup_{\theta\in\mathcal P(E)}B^*_\theta.$$

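Lemma 6.6. [Local equilibrium]

(a) The block resampling term satisfies, with $y$ the intensity of the configuration $x$ for $x\in B^*$,

(6.43) $$\begin{aligned}\lim_{N\to\infty}(L^{(N)[1]}_{\mathrm{res}}F)(y) &= \frac{\lambda_0}{2}\int_{\mathcal P(E)}\nu_y^{c_0,0,\Lambda_0}(dx)\int_E\int_E Q_x(du,dv)\,\frac{\partial^2F(y)}{\partial y\,\partial y}[\delta_u,\delta_v]\\ &= \frac{c_0\lambda_0}{2c_0+\lambda_0}\int_E\int_E Q_y(du,dv)\,\frac{\partial^2F(y)}{\partial y\,\partial y}[\delta_u,\delta_v].\end{aligned}$$

(b) If the system starts i.i.d. with some finite intensity measure, then every weak limit point of $\mathcal L[(X^{(N)}(Nt+u))_{u\in\mathbb R}]$ as $N\to\infty$ has paths that satisfy

(6.44) $$P\big(X^{(\infty)}(t,u)\in B^*\big)=1\qquad\forall\,t\in[0,\infty),\ u\in\mathbb R.$$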
Proof. (a) The proof uses the line of argument in [DGV95, Section 4(d)] (recall the comment in Section 6.2.3), together with (4.20) and the definition of $Q$. In what follows two observations are important:

(i) We use the results on the existence and uniqueness of a stationary distribution to (6.20) on the time scale $t$ with $N\to\infty$, including the convergence to the stationary distribution uniformly in the initial state, combined with the Feller property of the limiting dynamics (see Section 4). Note, in particular, that with (4.20) we get the second assertion in (6.43) from the first assertion.

(ii) We use the property that the laws of the processes $(Y^{(N)}(Nt))_{t\ge0}$, $N\in\mathbb N$, are tight in path space.

The combination of (i) and (ii) will allow us to derive the claim.



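To verify (ii), use (6.40) to establish boundedness of the characteristics, which gives the tightness through a standard criterion (Dawson [D93, Theorem 3.6.6]). To verify (i), we want to show that the weak limit points satisfy the $(\delta_\theta,L_\theta^{0,d_1,\delta_0})$-martingale problem. For that we have to show that

(6.45) $$\begin{aligned}&\mathcal L\Big[\Big(F(Y^{(N)}(tN))-F(Y^{(N)}(0))-\int_0^t(L^{(N)[1]}F)(Y^{(N)}(sN))\,ds\Big)_{t\ge0}\Big]\\ &\qquad\underset{N\to\infty}{\Longrightarrow}\ \mathcal L\Big[\Big(F(Z_\theta^{0,d_1,\delta_0}(t))-F(\theta)-\int_0^t(L_\theta^{0,d_1,\delta_0}F)(Z_\theta^{0,d_1,\delta_0}(s))\,ds\Big)_{t\ge0}\Big].\end{aligned}$$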



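In order to do so, we first need some information on $L^{(N)[1]}_{\mathrm{res}}$. Since we are on time scale $Nt$ with $N\to\infty$, we get

(6.46) $$\begin{aligned}\lim_{N\to\infty}(L^{(N)[1]}_{\mathrm{res}}F)(y) &= \int_{[0,1]}\Lambda^*_0(dr)\int_{\mathcal P(E)}\nu_y^{c_0,0,\Lambda_0}(dx)\int_E x(da)\,\frac12\,\frac{\partial^2F(y)}{\partial y\,\partial y}\big[r(-x+\delta_a),\,r(-x+\delta_a)\big]\\ &= \frac{\lambda_0}{2}\int_{\mathcal P(E)}\nu_y^{c_0,0,\Lambda_0}(dx)\int_E x(da)\,\frac{\partial^2F(y)}{\partial y\,\partial y}\big[-x+\delta_a,\,-x+\delta_a\big]\end{aligned}$$

for all $x\in B^*_y$, $y\in\mathcal P(E)$.

Use the definition of the Fleming-Viot diffusion operator $Q$ from (1.18) to obtain the claim in (6.43).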



(b) To show that the relevant configurations (under the limiting laws) are in $B^*$, we use a restart argument in combination with the ergodic theorem for the McKean-Vlasov process. Namely, to study the process at time $Nt+u$ we consider the time $Nt+u-t_N$ with $\lim_{N\to\infty}t_N=\infty$ and $\lim_{N\to\infty}t_N/N=0$. We know that the density process $Y^{(N)}$ at times $Nt+u-t_N$ and $Nt+u$ is the same in the limit $N\to\infty$, say equal to $\theta$, and so over the time stretch $t_N$ the process converges to the equilibrium $(\nu_\theta^{c_0,0,\Lambda_0})^{\otimes\mathbb N}$. By the law of large numbers this gives the claim. Therefore all possible limiting dynamics allow for an averaging principle with the local equilibrium. □
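The step from (6.46) to (6.43) uses that integrating $(-x+\delta_a)\otimes(-x+\delta_a)$ against $x(da)$ reproduces the Fleming-Viot operator. The sketch below checks this on a finite type space, assuming the standard form $Q_x(du,dv)=x(du)\delta_u(dv)-x(du)x(dv)$ for the operator in (1.18); the function names and the finite type space are illustrative assumptions, not notation from the text.

```python
def fv_operator(x):
    """Q_x(u, v) = x(u) 1{u = v} - x(u) x(v) for a probability vector x."""
    m = len(x)
    return [[x[u] * (u == v) - x[u] * x[v] for v in range(m)] for u in range(m)]


def resampling_second_moment(x, r):
    """sum_a x(a) * [((1-r)x + r*delta_a) (x) ((1-r)x + r*delta_a) - x (x) x]."""
    m = len(x)
    out = [[0.0] * m for _ in range(m)]
    for a in range(m):
        # State after a resampling event of size r with parent type a.
        new = [(1 - r) * x[u] + r * (u == a) for u in range(m)]
        for u in range(m):
            for v in range(m):
                out[u][v] += x[a] * (new[u] * new[v] - x[u] * x[v])
    return out


if __name__ == "__main__":
    x, r = [0.2, 0.3, 0.5], 0.4
    Q = fv_operator(x)
    M = resampling_second_moment(x, r)
    # The averaged tensor increment equals r^2 * Q_x entry by entry.
    print(max(abs(M[u][v] - r * r * Q[u][v]) for u in range(3) for v in range(3)))
```

The first-order terms cancel because $\int_E x(da)(\delta_a-x)=0$, which is the same cancellation as in (6.39); only the $r^2$-term survives, and integrating $r^2$ against $\Lambda^*_0$ gives the factor $\lambda_0$.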



Conclusion of the proof of Proposition 6.3

Recall from (6.27) that migration has no effect. Lemma 6.6 shows the effect of the block resampling term on time scale $Nt$ for $N\to\infty$. Adding both effects together, we have that all weak limit points of $\mathcal L[(Y^{(N)}(Nt))_{t\ge0}]$, $N\in\mathbb N$, satisfy

(6.47) the $(\delta_\theta,L_\theta^{0,d_1,\delta_0})$-martingale problem with $d_1=\dfrac{c_0\lambda_0}{2c_0+\lambda_0}$.

7 Hierarchical $C^\Lambda$-process

The next step in our construction is to consider finite spatial systems with a hierarchical structure of $K$ levels and to study the $k$-block averages with $k=0,1,\ldots,K$ on their natural time scales $N^kt$ and $N^kt+u$. This section therefore deals with the geographic space

(7.1) $$G=G_{N,K}=\{0,1,\ldots,N-1\}^K,\qquad N,K\in\mathbb N.$$

Define the Cannings process on $G_{N,K}$ by restricting the Cannings process $X^{(\Omega_N)}$ to $B_K(0)$ and putting

(7.2) $$c_k,\Lambda_k=0\quad\text{for all }k\ge K.$$

The corresponding process will be denoted by $X^{(N,K)}$ and its generator by $L^{(N,K)}$, etc. It is straightforward to also include a block resampling at rate $N^{-2K}$ with resampling measure $\Lambda_K$ (compare Corollary 6.2).

In this section our principal goal is to understand how we move up $0\le k\le K$ levels when starting from level 0. However, in order to also understand a system with $k$ levels starting from level, say, $L$ and moving up to level $L+k$, we will add a Fleming-Viot term with volatility $d_0\ge0$ to the generator of $X^{(N,K)}$. We do not need to add Fleming-Viot terms acting on higher blocks. As we saw in Lemma 6.6, a resampling term can result, on a higher time scale and in the limit as $N\to\infty$, in a Fleming-Viot term. For instance, if we choose $d_0=0$ in the beginning, then we obtain $d_1=\frac{c_0\lambda_0}{2c_0+\lambda_0}>0$ on time scale $Nt$ for the 1-block average (recall (6.47)).

We look at the block averages on space scales $N^k$ and time scales $N^kt$ with $k=1,\ldots,K$. In Section 7.1 we will focus on the case $K=2$, where most of the difficulties for general $K$ are already present. For $K>2$ lower order perturbations arise, which we will discuss only briefly in Section 7.2 because they can be treated similarly as in [DGV95]. In a later section we will take the limit $K\to\infty$ and show how this approximates the model with $G=\Omega_N$ on all the time scales we are interested in for our main theorem.



7.1 Two-level systems 

The geographic space is $G_{N,2}=\{0,1,\ldots,N-1\}^2$, we pick $d_0,c_0,c_1,\lambda_0,\lambda_1>0$ and put $c_k,\Lambda_k$ to zero for $k\ge2$. We will prove the following: (1) On time scales $t$ and $Nt$ we obtain the same limiting objects as described in Section 6, but with an additional Fleming-Viot term ($d_0>0$) and with block resampling via $\Lambda_1$; (2) For the 1-block averages on time scale $Nt$ (each belonging to an address $\eta\in\{0,1,\ldots,N-1\}$),

(7.3) $$Y^{(N)}_\eta(t)=N^{-1}\sum_{\xi}X^{(N)}_\xi(t)\qquad\text{(sum over the $N$ sites $\xi$ in the 1-block with address $\eta$)},$$

and for the total average on time scale $N^2t$,

(7.4) $$Z^{(N)}(t)=N^{-2}\sum_{\xi\in G_{N,2}}X^{(N)}_\xi(t),$$

we get a similar structure.

Namely, we can replace the system $(Z^{(N)},Y^{(N)})$ for $N\to\infty$ by a system of the type in Section 6, with the role of components on time scale $t$ taken over by 1-block averages on time scale $Nt$ and the role of the total (1-block) average on time scale $Nt$ taken over by the 2-block average on time scale $N^2t$. Once again, we only focus on the new features arising in our model. The general scheme of the proof for the two-level system can be found in [DGV95, Section 5(a), pp. 2328-2337]. The calculations in Sections 7.1.1-7.1.3 correspond to Steps 4-5 in [DGV95, Section 5(a)], with the focus now shifted from the characteristics of diffusions to the full generator because we are dealing with jump processes.

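Proposition 7.1. [Two-level rescaling]

Under the assumptions made above,

(7.5) $$\mathcal L\big[(X^{(N)}_\xi(t))_{t\ge0}\big]\ \underset{N\to\infty}{\Longrightarrow}\ \mathcal L\big[(Z_\theta^{c_0,d_0,\Lambda_0}(t))_{t\ge0}\big]\qquad\forall\,\xi\in G_{N,2},$$

and

(7.6) $$\mathcal L\big[(Y^{(N)}_\eta(Nt))_{t\ge0}\big]\ \underset{N\to\infty}{\Longrightarrow}\ \mathcal L\big[(Z_\theta^{c_1,d_1,\Lambda_1}(t))_{t\ge0}\big]\quad\text{with}\quad d_1=\frac{c_0(\lambda_0+2d_0)}{2c_0+\lambda_0+2d_0},$$

and (with $\Lambda=\delta_0$)

(7.7) $$\mathcal L\big[(Z^{(N)}(N^2t))_{t\ge0}\big]\ \underset{N\to\infty}{\Longrightarrow}\ \mathcal L\big[(Z_\theta^{0,d_2,\delta_0}(t))_{t\ge0}\big]\quad\text{with}\quad d_2=\frac{c_1(\lambda_1+2d_1)}{2c_1+\lambda_1+2d_1}.$$

The proof of (7.5)-(7.7) is carried out in Sections 7.1.1-7.1.3.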
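The two constants in (7.6) and (7.7) are instances of one and the same map $(c,\lambda,d)\mapsto c(\lambda+2d)/(2c+\lambda+2d)$. A minimal sketch of iterating this map with exact rational arithmetic is given below; the function names are hypothetical, and the extension to more than two levels is an assumption that merely repeats the two-level formula.

```python
from fractions import Fraction


def next_volatility(c, lam, d):
    """One step d_k -> d_{k+1} = c_k (lam_k + 2 d_k) / (2 c_k + lam_k + 2 d_k)."""
    return c * (lam + 2 * d) / (2 * c + lam + 2 * d)


def volatilities(cs, lams, d0):
    """Return [d_0, d_1, ..., d_K] for migration rates cs and resampling masses lams."""
    ds = [d0]
    for c, lam in zip(cs, lams):
        ds.append(next_volatility(c, lam, ds[-1]))
    return ds


if __name__ == "__main__":
    one = Fraction(1)
    # Two levels with c_0 = c_1 = 1, lambda_0 = lambda_1 = 1 and d_0 = 0.
    print(volatilities([one, one], [one, one], Fraction(0)))
```

With $d_0=0$ the first step reduces to $d_1=c_0\lambda_0/(2c_0+\lambda_0)$, in agreement with (6.47).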



7.1.1 The single components on time scale t 

In this section, our main goal is to argue that the components of $X^{(N)}$ change on time scale $t$ as before, and that the same holds on time scales $Nt+u$ and $N^2t+u$ with $u\in\mathbb R$, provided we use the appropriate value for the 1-block average as the centre of drift.

We first look at the components on time scale $t$. Due to the Markov property and the continuity in $\theta$ of the law of the McKean-Vlasov process, the behaviour of the components on time scales $Nt+u$ and $N^2t+u$ with $u\in\mathbb R$ is immediate once we have the tightness of $Y^{(N)}$ and $Z^{(N)}$ on these scales. Again, our convergence results are obtained by: (1) establishing tightness in path space; (2) verifying convergence of the finite-dimensional distributions via establishing asymptotic independence and the generator calculation for the martingale problem. Since the latter is key also for the tightness arguments, we give the analysis of the generator terms first. In fact, the rest of the argument is the same as in Section 6.1.



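Migration part. Consider the migration operator in (1.36) and (1.25) applied to functions $F\in\mathcal F$, the algebra of functions in (1.33). The migration operator can be rewritten as

(7.8) $$\begin{aligned}(L^{(N,2)}_{\mathrm{mig}}F)(x) &= \sum_{\xi,\zeta\in G_{N,2}}a^{(N)}(\xi,\zeta)\int_E(x_\zeta-x_\xi)(da)\,\frac{\partial F(x)}{\partial x_\xi}[\delta_a]\\ &= \sum_{\xi,\zeta\in G_{N,2}}\ \sum_{d(\xi,\zeta)\le k\le2}c_{k-1}N^{1-2k}\int_E(x_\zeta-x_\xi)(da)\,\frac{\partial F(x)}{\partial x_\xi}[\delta_a]\\ &= \sum_{\xi\in G_{N,2}}\ \sum_{1\le k\le2}c_{k-1}N^{1-2k}\sum_{\zeta\in B_k(\xi)}\int_E(x_\zeta-x_\xi)(da)\,\frac{\partial F(x)}{\partial x_\xi}[\delta_a]\\ &= \sum_{\xi\in G_{N,2}}\ \sum_{1\le k\le2}c_{k-1}N^{1-k}\int_E(y_{\xi,k}-x_\xi)(da)\,\frac{\partial F(x)}{\partial x_\xi}[\delta_a],\end{aligned}$$

where we use (1.30) in the last line. Thus, for $F$ as in (1.33), we obtain

(7.9) $$(L^{(N,2)}_{\mathrm{mig}}F)(x)=\sum_{\xi\in G_{N,2}}c_0\int_E(y_{\xi,1}-x_\xi)(da)\,\frac{\partial F(x)}{\partial x_\xi}[\delta_a]+E^{(N)},$$

where

(7.10) $$|E^{(N)}|\le N^{-1}c_1C_F=O(N^{-1})$$

with $C_F$ a generic constant depending on the choice of $F$ only. Here we use that, by the definition of $F$ in (1.33), the sum over $\xi\in G_{N,2}$ is a sum over finitely many coordinates only, with the number depending on $F$ only.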



Resampling part. Consider the resampling operator $(L^{(N,2)}_{\mathrm{res}}F)(x)$ in (1.37). We have

(7.11) $$(L^{(N,2)}_{\mathrm{res}}F)(x)=\sum_{\xi\in G_{N,2}}\int_{[0,1]}\Lambda^*_0(dr)\int_E x_\xi(da)\,\big[F(\Phi_{r,a,B_0(\xi)}(x))-F(x)\big]+E^{(N)}$$

with

(7.12) $$|E^{(N)}|\le N^{-2}\int_{[0,1]}\Lambda^*_1(dr)\,C_F\,r^2N=C_FN^{-1}\lambda_1=O(N^{-1}).$$

Here we use (1.38) in the first inequality, together with the fact that $F(\Phi_{r,a,B_1(\xi)}(x))-F(x)$ is non-zero for at most $C_FN$ different values of $\xi\in G_{N,2}$.

Additional Fleming-Viot part. Recall that in this section we consider the case $d_0>0$, i.e., we add the Fleming-Viot generator

(7.13) $$(L^{(N,2)}_{\mathrm{FV}}F)(x)=d_0\sum_{\xi\in G_{N,2}}\int_E\int_E Q_{x_\xi}(du,dv)\,\frac{\partial^2F(x)}{\partial x_\xi\,\partial x_\xi}[\delta_u,\delta_v].$$

Contrary to the migration and the resampling operator, the Fleming-Viot operator does not act on higher block levels.



The resulting generator. Combining the migration parts (7.9) and (7.10), the resampling parts (7.11) and (7.12), and the Fleming-Viot part (7.13), we obtain

(7.14) $$\begin{aligned}(L^{(N,2)}F)(x) = {}&\sum_{\xi\in G_{N,2}}c_0\int_E(y_{\xi,1}-x_\xi)(da)\,\frac{\partial F(x)}{\partial x_\xi}[\delta_a]\\ &+\sum_{\xi\in G_{N,2}}\int_{[0,1]}\Lambda^*_0(dr)\int_E x_\xi(da)\,\big[F(\Phi_{r,a,B_0(\xi)}(x))-F(x)\big]\\ &+d_0\sum_{\xi\in G_{N,2}}\int_E\int_E Q_{x_\xi}(du,dv)\,\frac{\partial^2F(x)}{\partial x_\xi\,\partial x_\xi}[\delta_u,\delta_v]+O(N^{-1}).\end{aligned}$$

The order term is independent of $x$.

Convergence to McKean-Vlasov process. We can use (7.14) to argue that
(7.15) $$\big\|L^{(N,2)}F-L^{(N)}F\big\|_\infty\le C_FN^{-1}.$$



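Next, following again the line of argument in Section 6.1, we obtain that $X^{(N)}$ converges as a process to the McKean-Vlasov limit, which is an i.i.d. collection of single components indexed by $\mathbb N_0$ with generator

(7.16) $$\begin{aligned}(L_\theta^{c_0,d_0,\Lambda_0}G)(x_\xi) = {}&c_0\int_E(\theta-x_\xi)(da)\,\frac{\partial G(x_\xi)}{\partial x_\xi}[\delta_a]\\ &+\int_{[0,1]}\Lambda^*_0(dr)\int_E x_\xi(da)\,\big[G((1-r)x_\xi+r\delta_a)-G(x_\xi)\big]\\ &+d_0\int_E\int_E Q_{x_\xi}(du,dv)\,\frac{\partial^2G(x_\xi)}{\partial x_\xi\,\partial x_\xi}[\delta_u,\delta_v],\end{aligned}$$

where $\theta\in\mathcal P(E)$ is the initial mean measure. This completes the proof of (7.5).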



7.1.2 The 1-block averages on time scale Nt 

Again, we need to prove: (1) tightness in path space of $(Y^{(N)}(Nt))_{t\ge0}$; (2) convergence of finite-dimensional distributions via asymptotic independence and generator convergence. As we saw in Section 6, the latter is also the key to tightness. Therefore we proceed by first calculating the generator of 1-block averages on time scale $Nt$ and then using this generator to show convergence of the process. At that point we need that the average over the full space remains $\theta$, in the sense of a constant path on time scale $Nt$. The latter property will be proved in Section 7.1.3.

Basic generator formula. We proceed as in Section 6.2.1. Since $G=G_{N,2}$, the 1-block averages are now indexed too. We use the following notation for the indexing of 1-block averages. Recall the notation $y_{\xi,1}=N^{-1}\sum_{\zeta\in B_1(\xi)}x_\zeta$ from (1.30), which is the 1-block around $\xi$. This 1-block coincides with the 1-block around $\zeta$ if and only if $d(\xi,\zeta)\le1$. To endow every 1-block with a unique label, we proceed as follows. Let $\phi$ be the shift-operator

(7.17) $$\phi\colon G_{N,K}\to G_{N,K-1},\qquad(\phi\xi)_i=\xi_{i+1},\quad0\le i<K-1,\quad K\in\mathbb N.$$

We consider the evolution in time of the 1-block averages indexed block-wise, i.e.,

(7.18) $$y^{[1]}_\eta=N^{-1}\sum_{\xi\colon\phi\xi=\eta}x_\xi,$$

where once again we suppress the dependence of $y^{[1]}_\eta$ on $N$. Note, in particular, that

(7.19) $$y_{\xi,1}=y^{[1]}_\eta\quad\text{for all $\xi$ such that }\phi\xi=\eta.$$

We will often drop the superscript [1], especially when the context is clear.

This time, we consider functions $F\in\mathcal F$ (see (1.33)) applied to $y=y^{[1]}(x)$, where $y^{[1]}=(y^{[1]}_\eta)_{\eta\in G_{N,1}}$. By explicit calculation of the different terms below, we will obtain the following expression:



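(7.20) $$\begin{aligned}(L^{(N,2)[1]}F)(y) = {}&\Big(\big(L^{(N,2)[1]}_{\mathrm{mig}}+L^{(N,2)[1]}_{\mathrm{res}}+L^{(N,2)[1]}_{\mathrm{FV}}\big)(F\circ y)\Big)(x)\\ = {}&\sum_{\eta\in G_{N,1}}c_1\int_E(y_{\eta,1}-y_\eta)(da)\,\frac{\partial F(y)}{\partial y_\eta}[\delta_a]\\ &+\sum_{m=1}^q\frac1N\sum_{\xi\colon\phi\xi=\eta^{(m)}}\int_{[0,1]}\Lambda^*_0(dr)\int_E x_\xi(da)\,\frac12\,\frac{\partial^2F(y)}{\partial y_{\eta^{(m)}}\,\partial y_{\eta^{(m)}}}\big[r(-x_\xi+\delta_a),\,r(-x_\xi+\delta_a)\big]\\ &+\sum_{\eta\in G_{N,1}}\int_{[0,1]}\Lambda^*_1(dr)\int_E y_\eta(da)\,\big[F(\Phi_{r,a,\eta}(y))-F(y)\big]\\ &+d_0\sum_{\eta\in G_{N,1}}\frac1N\sum_{\xi\colon\phi\xi=\eta}\int_E\int_E Q_{x_\xi}(du,dv)\,\frac{\partial^2F(y)}{\partial y_\eta\,\partial y_\eta}[\delta_u,\delta_v]+O(N^{-1}).\end{aligned}$$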



Convergence to McKean-Vlasov process. We first argue how to conclude the argument, 
and then further below we carry out the necessary generator calculations. 

We have to argue first that the $N$ different 1-blocks satisfy the propagation of chaos property (recall (6.6)), where we had this for components. The proof again uses duality, namely, dual particles from different 1-blocks need a time of order $N^2$ to meet and hence do not meet on time scale $Nt$. We do not repeat the details here.

Once we have the propagation of chaos property, it suffices to consider single blocks, which we do next. We have to verify tightness in path space and convergence of the finite-dimensional distributions. As we saw before, this reduces to convergence of the generator on $\mathcal F$ by the same tightness argument used in Section 6.1.1, but now based on (7.20). Consider the resampling and Fleming-Viot parts of the generator separately.

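Reason as in the proof of Lemma 6.6 to see that

(7.21) $$\begin{aligned}\lim_{N\to\infty}(L^{(N,2)[1]}_{\mathrm{res},0}F)(y) &= \lim_{N\to\infty}\sum_{m=1}^q\frac1N\sum_{\xi\colon\phi\xi=\eta^{(m)}}\int_{[0,1]}\Lambda^*_0(dr)\int_E x_\xi(da)\,\frac12\,\frac{\partial^2F(y)}{\partial y_{\eta^{(m)}}\,\partial y_{\eta^{(m)}}}\big[r(-x_\xi+\delta_a),\,r(-x_\xi+\delta_a)\big]\\ &= \frac{\lambda_0}{2}\sum_{\eta\in\mathbb N_0}\int_{\mathcal P(E)}\nu_{y_\eta}^{c_0,d_0,\Lambda_0}(dx)\int_E\int_E Q_x(du,dv)\,\frac{\partial^2F(y)}{\partial y_\eta\,\partial y_\eta}[\delta_u,\delta_v]\\ &= \frac{c_0\lambda_0}{2c_0+\lambda_0+2d_0}\sum_{\eta\in\mathbb N_0}\int_E\int_E Q_{y_\eta}(du,dv)\,\frac{\partial^2F(y)}{\partial y_\eta\,\partial y_\eta}[\delta_u,\delta_v],\end{aligned}$$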



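where by (4.20) the second assertion follows from the first. Similarly, we have

(7.22) $$\lim_{N\to\infty}(L^{(N,2)[1]}_{\mathrm{FV}}F)(y)=d_0\sum_{\eta\in\mathbb N_0}\int_{\mathcal P(E)}\nu_{y_\eta}^{c_0,d_0,\Lambda_0}(dx)\int_E\int_E Q_x(du,dv)\,\frac{\partial^2F(y)}{\partial y_\eta\,\partial y_\eta}[\delta_u,\delta_v].$$

Using (4.20) once more, we get

(7.23) $$\text{r.h.s. (7.22)}=\frac{2c_0d_0}{2c_0+\lambda_0+2d_0}\sum_{\eta\in\mathbb N_0}\int_E\int_E Q_{y_\eta}(du,dv)\,\frac{\partial^2F(y)}{\partial y_\eta\,\partial y_\eta}[\delta_u,\delta_v].$$

Combine (7.21) with (7.23) and argue as in Section 6.1.4, to see that each single component of the 1-block averages $y=y^{[1]}=(y^{[1]}_\eta)_{\eta\in G_{N,1}}$ converges and the limiting coordinate process has generator

(7.24) $$\begin{aligned}(L_\theta^{c_1,d_1,\Lambda_1}G)(y_\eta) = {}&c_1\int_E(\theta-y_\eta)(da)\,\frac{\partial G(y_\eta)}{\partial y_\eta}[\delta_a]\\ &+d_1\int_E\int_E Q_{y_\eta}(du,dv)\,\frac{\partial^2G(y_\eta)}{\partial y_\eta\,\partial y_\eta}[\delta_u,\delta_v]\\ &+\int_{[0,1]}\Lambda^*_1(dr)\int_E y_\eta(da)\,\big[G((1-r)y_\eta+r\delta_a)-G(y_\eta)\big],\end{aligned}$$

where $\theta\in\mathcal P(E)$ is the initial mean measure of a component and $d_1=\frac{c_0(\lambda_0+2d_0)}{2c_0+\lambda_0+2d_0}$. At this point we use that the average over the complete population remains the path that stands still at $\theta$ on time scale $Nt$.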



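Generator calculation: proof of (7.20). We next verify the expression given in (7.20). We calculate separately the action of the various terms in the generator on the function $F$. In what follows a change to time scale $N^kt$ is denoted by an additional superscript $[k]$.

Migration part. Recall $(L^{(N,2)}_{\mathrm{mig}}F)(x)$ from (7.8). Proceeding along the lines of (6.25)-(6.27), we get

(7.25) $$\begin{aligned}(L^{(N,2)}_{\mathrm{mig}}F)(y) &= \sum_{\xi\in G_{N,2}}\ \sum_{k\le2}c_{k-1}N^{1-k}\int_E(y_{\xi,k}-x_\xi)(da)\,\frac{\partial(F\circ y)(x)}{\partial x_\xi}[\delta_a]\\ &= \sum_{\xi\in G_{N,2}}\ \sum_{k\le2}c_{k-1}N^{1-k}\int_E(y_{\xi,k}-x_\xi)(da)\,\frac{\partial F(y)}{\partial y_{\phi\xi}}\Big[\frac{\delta_a}{N}\Big]\\ &= N\sum_{\eta\in G_{N,1}}\ \sum_{k\le2}c_{k-1}N^{1-k}\int_E(y_{\eta,k-1}-y_\eta)(da)\,\frac{\partial F(y)}{\partial y_\eta}\Big[\frac{\delta_a}{N}\Big]\\ &= \sum_{\eta\in G_{N,1}}\ \sum_{k\le1}c_kN^{1-k}\int_E(y_{\eta,k}-y_\eta)(da)\,\frac{\partial F(y)}{\partial y_\eta}\Big[\frac{\delta_a}{N}\Big].\end{aligned}$$

Next, for functions $F$ that are linear combinations of functions in (1.33), we have

(7.26) $$N\,\frac{\partial F(y)}{\partial y_\eta}\Big[\frac{\delta_a}{N}\Big]=\frac{\partial F(y)}{\partial y_\eta}[\delta_a].$$

On the time scale $Nt$, we have

(7.27) $$(L^{(N,2)[1]}_{\mathrm{mig}}F)(y)=\sum_{\eta\in G_{N,1}}c_1\int_E(y_{\eta,1}-y_\eta)(da)\,\frac{\partial F(y)}{\partial y_\eta}[\delta_a].$$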



Resampling part. The calculations proceed along the same lines as in Section 6.2.2. Apart from an additional higher-order term, the main extension is that we consider $F(y_t)=F(y^{[1]}_t)=\big\langle\varphi,\bigotimes_{l=1}^q(y_{\eta^{(l)}})^{\otimes n_l}\big\rangle$ with $y=y^{[1]}=(y^{[1]}_\eta)_{\eta\in G_{N,1}}$, $q\in\{1,\ldots,N\}$ and $n_l\in\mathbb N$, $1\le l\le q$, instead of (6.29) (which corresponds to the case $q=1$). We will restrict ourselves to functions $F$ of the form

(7.28) $$\begin{aligned}&F(y)=\int\Big(\bigotimes_{l=1}^q y_{\eta^{(l)}}^{\otimes n_l}\Big)\big(du^{(1)},\ldots,du^{(q)}\big)\,\varphi\big(u^{(1)},\ldots,u^{(q)}\big),\qquad y=(y_\eta)_{\eta\in G_{N,1}}\in\mathcal P(E)^N,\\ &q\in\{1,\ldots,N\},\ n_l\in\mathbb N,\ \eta^{(l)}\in G_{N,1},\ l\in\{1,\ldots,q\},\\ &\eta^{(l)}\neq\eta^{(l')}\ \text{for all }l\neq l',\quad u^{(l)}\in E^{n_l},\quad\varphi\in C_b(E^{n_1+\cdots+n_q},\mathbb R).\end{aligned}$$

The only difference with (1.33) is the restriction of the ordering of the entries. This facilitates the notation in the computation below, but is no loss of generality because the set of functions in (7.28) generates the same algebra $\mathcal F$. We will now show that



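(7.29) $$\begin{aligned}(L^{(N,2)[1]}_{\mathrm{res}}F)(y) = {}&\sum_{m=1}^q\frac1N\sum_{\xi\colon\phi\xi=\eta^{(m)}}\int_{[0,1]}\Lambda^*_0(dr)\int_E x_\xi(da)\,\frac12\,\frac{\partial^2F(y)}{\partial y_{\eta^{(m)}}\,\partial y_{\eta^{(m)}}}\big[r(-x_\xi+\delta_a),\,r(-x_\xi+\delta_a)\big]\\ &+\sum_{\eta\in G_{N,1}}\int_{[0,1]}\Lambda^*_1(dr)\int_E y_\eta(da)\,\big[F(\Phi_{r,a,\eta}(y))-F(y)\big]+O(N^{-1}).\end{aligned}$$

Recall the notation in (6.30) and set

(7.30) $$L=\sum_{l=1}^q n_l.$$

Proceeding as in (6.29)-(6.31), we obtain

(7.31) $$(L^{(N,2)}_{\mathrm{res}}F)(y)=\frac{1}{N^L}\Big(\bigotimes_{l=1}^q\bigotimes_{i=1}^{n_l}\sum_{\xi^l_i\colon\phi\xi^l_i=\eta^{(l)}}\Big)\big(L^{(N,2)}_{\mathrm{res}}F^{(\xi^1_1,\ldots,\xi^q_{n_q})}\big)(x).$$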



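As in Section 6.2, we distinguish between the different cases for the structure of the set $\{\xi^1_1,\ldots,\xi^q_{n_q}\}$ and we obtain, using the definition of the resampling operator in (1.37),

(7.32) $$\begin{aligned}(L^{(N,2)}_{\mathrm{res}}F)(y) = {}&\frac{1}{N^L}\Big(\bigotimes_{l=1}^q\bigotimes_{i=1}^{n_l}\sum_{\xi^l_i\colon\phi\xi^l_i=\eta^{(l)}}\Big)\sum_{\xi\in G_{N,2}}\int_{[0,1]}\Lambda^*_0(dr)\int_E x_\xi(da)\\ &\qquad\times\Big[F^{(\xi^1_1,\ldots,\xi^q_{n_q})}(\Phi_{r,a,B_0(\xi)}(x))-F^{(\xi^1_1,\ldots,\xi^q_{n_q})}(x)\Big]\\ &+\frac{1}{N^L}\Big(\bigotimes_{l=1}^q\bigotimes_{i=1}^{n_l}\sum_{\xi^l_i\colon\phi\xi^l_i=\eta^{(l)}}\Big)\sum_{\xi\in G_{N,2}}N^{-2}\int_{[0,1]}\Lambda^*_1(dr)\int_E y_{\xi,1}(da)\\ &\qquad\times\Big[F^{(\xi^1_1,\ldots,\xi^q_{n_q})}(\Phi_{r,a,B_1(\xi)}(x))-F^{(\xi^1_1,\ldots,\xi^q_{n_q})}(x)\Big]\\ = {}&I_0+I_1.\end{aligned}$$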

For the first term $I_0$ in (7.32) we proceed along the lines of (6.33)-(6.34) to conclude that the only non-negligible contribution to the sum in $I_0$ comes from terms with $|\{\xi^l_i\colon1\le l\le q,\,1\le i\le n_l\}|=L-1$. It remains to investigate these terms. Since $\phi\xi^l_i=\eta^{(l)}$, this implies that there exist $1\le m\le q$ and $1\le m_1<m_2\le n_m$ such that $\xi^m_{m_1}=\xi^m_{m_2}$ and all other indices are different. By the same reasoning as in (6.33), we see that the only non-zero contribution of the sum $\sum_{\xi\in G_{N,2}}$ comes from $\xi=\xi^m_{m_1}=\xi^m_{m_2}$. We therefore obtain

(7.33) $$\begin{aligned}I_0 = {}&\frac{1}{N^L}\sum_{m=1}^q\ \sum_{1\le m_1<m_2\le n_m}\Big(\bigotimes_{l=1}^q\bigotimes_{i=1}^{n_l}\sum_{\xi^l_i\colon\phi\xi^l_i=\eta^{(l)}}\Big)1_{\{\xi^m_{m_1}=\xi^m_{m_2}\}}\\ &\times\int_{[0,1]}\Lambda^*_0(dr)\int_E x_\xi(da)\,\Big[F^{(\xi^1_1,\ldots,\xi^q_{n_q})}(\Phi_{r,a,B_0(\xi)}(x))-F^{(\xi^1_1,\ldots,\xi^q_{n_q})}(x)\Big]+O(N^{-2}),\end{aligned}$$

with $\xi=\xi^m_{m_1}$. Now follow the reasoning from (6.35) to (6.40), to get

(7.34) $$I_0=\frac{1}{N^2}\sum_{m=1}^q\ \sum_{\xi\colon\phi\xi=\eta^{(m)}}\int_{[0,1]}\Lambda^*_0(dr)\int_E x_\xi(da)\,\frac12\,\frac{\partial^2F(y)}{\partial y_{\eta^{(m)}}\,\partial y_{\eta^{(m)}}}\big[r(-x_\xi+\delta_a),\,r(-x_\xi+\delta_a)\big]+O(N^{-2}).$$

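For the second term $I_1$ in (7.32), we obtain, by the definition of $\Phi_{r,a,B_1(\xi)}(x)$ in (1.38) and using (7.19),

(7.35) $$\begin{aligned}I_1 &= \frac{1}{N^L}\Big(\bigotimes_{l=1}^q\bigotimes_{i=1}^{n_l}\sum_{\xi^l_i\colon\phi\xi^l_i=\eta^{(l)}}\Big)\sum_{\xi\in G_{N,2}}N^{-2}\int_{[0,1]}\Lambda^*_1(dr)\int_E y_{\xi,1}(da)\,\Big[F^{(\xi^1_1,\ldots,\xi^q_{n_q})}(\Phi_{r,a,B_1(\xi)}(x))-F^{(\xi^1_1,\ldots,\xi^q_{n_q})}(x)\Big]\\ &= \frac{1}{N^L}\Big(\bigotimes_{l=1}^q\bigotimes_{i=1}^{n_l}\sum_{\xi^l_i\colon\phi\xi^l_i=\eta^{(l)}}\Big)\sum_{\eta\in G_{N,1}}N^{-1}\int_{[0,1]}\Lambda^*_1(dr)\int_E y_\eta(da)\,\Big[F^{(\xi^1_1,\ldots,\xi^q_{n_q})}(\tilde x)-F^{(\xi^1_1,\ldots,\xi^q_{n_q})}(x)\Big]\end{aligned}$$

with

(7.36) $$\tilde x_\xi=\begin{cases}(1-r)y_\eta+r\delta_a,&\phi\xi=\eta,\\ x_\xi,&\text{otherwise}.\end{cases}$$

Now observe that the sum $\sum_{\eta\in G_{N,1}}$ in (7.35) yields non-zero contributions only for $\eta\in\{\eta^{(1)},\ldots,\eta^{(q)}\}$, and so we can rewrite $I_1$ as

(7.37) $$\begin{aligned}I_1 &= \frac{1}{N^L}\Big(\bigotimes_{l=1}^q\bigotimes_{i=1}^{n_l}\sum_{\xi^l_i\colon\phi\xi^l_i=\eta^{(l)}}\Big)\sum_{l=1}^qN^{-1}\int_{[0,1]}\Lambda^*_1(dr)\int_E y_{\eta^{(l)}}(da)\\ &\qquad\times\Big[\big\langle\varphi,x_{\xi^1_1}\otimes\cdots\otimes\big((1-r)y_{\eta^{(l)}}+r\delta_a\big)^{\otimes n_l}\otimes\cdots\otimes x_{\xi^q_{n_q}}\big\rangle-\big\langle\varphi,x_{\xi^1_1}\otimes\cdots\otimes x_{\xi^q_{n_q}}\big\rangle\Big]\\ &= \sum_{l=1}^qN^{-1}\int_{[0,1]}\Lambda^*_1(dr)\int_E y_{\eta^{(l)}}(da)\,\Big[\Big\langle\varphi,y_{\eta^{(1)}}^{\otimes n_1}\otimes\cdots\otimes\big((1-r)y_{\eta^{(l)}}+r\delta_a\big)^{\otimes n_l}\otimes\cdots\otimes y_{\eta^{(q)}}^{\otimes n_q}\Big\rangle-\Big\langle\varphi,\bigotimes_{l'=1}^q y_{\eta^{(l')}}^{\otimes n_{l'}}\Big\rangle\Big]\\ &= \sum_{\eta\in G_{N,1}}N^{-1}\int_{[0,1]}\Lambda^*_1(dr)\int_E y_\eta(da)\,\big[F(\Phi_{r,a,\eta}(y))-F(y)\big],\end{aligned}$$

where in the second line the change affects positions $\xi^l_1$ to $\xi^l_{n_l}$. Combining (7.32), (7.34) and (7.37), we obtain (7.29) on time scale $Nt$.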



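Additional Fleming-Viot part. We proceed as for the migration operator and write

(7.38) $$(L^{(N,2)}_{\mathrm{FV}}F)(y)=\big(L^{(N,2)}_{\mathrm{FV}}(F\circ y)\big)(x)=d_0\sum_{\xi\in G_{N,2}}\int_E\int_E Q_{x_\xi}(du,dv)\,\frac{\partial^2(F\circ y)(x)}{\partial x_\xi\,\partial x_\xi}[\delta_u,\delta_v],$$

where the definition of $y=y^{[1]}$ in (7.18) yields

(7.39) $$\frac{\partial^2(F\circ y)(x)}{\partial x_\xi\,\partial x_\xi}[\delta_u,\delta_v]=\frac{\partial^2F(y)}{\partial y_{\phi\xi}\,\partial y_{\phi\xi}}\Big[\frac{\delta_u}{N},\frac{\delta_v}{N}\Big].$$

Hence, on time scale $Nt$,

(7.40) $$\begin{aligned}(L^{(N,2)[1]}_{\mathrm{FV}}F)(y) &= d_0\,N\sum_{\eta\in G_{N,1}}\ \sum_{\xi\colon\phi\xi=\eta}\int_E\int_E Q_{x_\xi}(du,dv)\,\frac{\partial^2F(y)}{\partial y_\eta\,\partial y_\eta}\Big[\frac{\delta_u}{N},\frac{\delta_v}{N}\Big]\\ &= d_0\sum_{\eta\in G_{N,1}}\frac1N\sum_{\xi\colon\phi\xi=\eta}\int_E\int_E Q_{x_\xi}(du,dv)\,\frac{\partial^2F(y)}{\partial y_\eta\,\partial y_\eta}[\delta_u,\delta_v],\end{aligned}$$

where in the last line we use that, for $F$ a linear combination of the functions in (1.33),

(7.41) $$N^2\,\frac{\partial^2F(y)}{\partial y_\eta\,\partial y_\eta}\Big[\frac{\delta_u}{N},\frac{\delta_v}{N}\Big]=\frac{\partial^2F(y)}{\partial y_\eta\,\partial y_\eta}[\delta_u,\delta_v].$$

The resulting generator. Combining the migration (7.27), resampling (7.29) and Fleming-Viot (7.40) parts for the 1-block averages on time scale $Nt$, we obtain (7.20). This completes the proof of (7.6).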



7.1.3 The total average on time scale N 2 t 

Denote the total average by

(7.42) $$z=N^{-1}\sum_{\eta\in G_{N,1}}y_\eta=N^{-2}\sum_{\xi\in G_{N,2}}x_\xi.$$

(This is a 2-block average because we are considering the case $K=2$.) We must prove: (1) $\{\mathcal L[(Z^{(N)}(tN^2))_{t\ge0}],\,N\in\mathbb N\}$ is tight in path space; (2) the weak limit points of this sequence are solutions of the martingale problem (recall Section 6.1). From the uniqueness of the solution of the martingale problem we get the claim.

To verify (1), we first observe that we have a martingale (compare with (7.43)) for every $N$. Hence, we want to show convergence to a continuous martingale, i.e., we have to apply tightness criteria for martingales. The necessary information is collected below by going through the various mechanisms step by step, which is the key to (2) as well, the convergence of the finite-dimensional distributions.



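Migration part. For the total average the migration operator can be obtained from (7.27) by writing $z=z(y)$ and using the analogue to (6.26),

(7.43) $$(L^{(N,2)[1]}_{\mathrm{mig}}F)(z)=\big(L^{(N,2)[1]}_{\mathrm{mig}}(F\circ z)\big)(y)=\sum_{\eta\in G_{N,1}}c_1\int_E(y_{\eta,1}-y_\eta)(da)\,\frac{\partial F(z)}{\partial z}\Big[\frac{\delta_a}{N}\Big].$$

Using that $z=y_{\eta,1}=N^{-1}\sum_{\eta'\in G_{N,1}}y_{\eta'}$ for all $\eta\in G_{N,1}$, we get

(7.44) $$(L^{(N,2)[1]}_{\mathrm{mig}}F)(z)=(L^{(N,2)[2]}_{\mathrm{mig}}F)(z)=0.$$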
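The cancellation behind (7.44) is the same as in (6.27). As a one-line sketch, using only that $y_{\eta,1}=z$ for every $\eta\in G_{N,1}$ and that the derivative term in (7.43) does not depend on $\eta$:

```latex
\sum_{\eta\in G_{N,1}} (y_{\eta,1}-y_\eta)
  = \sum_{\eta\in G_{N,1}} (z-y_\eta)
  = Nz-\sum_{\eta\in G_{N,1}} y_\eta
  = Nz-Nz
  = 0 .
```

Since the signed measure integrated against the derivative vanishes identically, the identity holds on every time scale, which is why no rescaling can make the migration term reappear for the total average.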



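Resampling part. Consider $F(z)=\langle\varphi,z^{\otimes n}\rangle$. Follow the derivation of (6.31) to obtain

(7.45) $$(L^{(N,2)}_{\mathrm{res}}F)(z)=\frac{1}{N^n}\Big(\bigotimes_{l=1}^n\sum_{\eta_l\in G_{N,1}}\Big)\big(L^{(N,2)}_{\mathrm{res}}F^{(\eta_1,\ldots,\eta_n)}\big)(y)=I'_0+I'_1$$

with $F^{(\eta_1,\ldots,\eta_n)}(y)=\big\langle\varphi,\bigotimes_{l=1}^n y_{\eta_l}\big\rangle$ as in (6.30), where we recall from (7.32) that

(7.46) $$\begin{aligned}\big(L^{(N,2)}_{\mathrm{res}}F^{(\eta_1,\ldots,\eta_n)}\big)(y) = {}&\frac{1}{N^n}\Big(\bigotimes_{l=1}^n\sum_{\xi_l\colon\phi\xi_l=\eta_l}\Big)\sum_{\xi\in G_{N,2}}\int_{[0,1]}\Lambda^*_0(dr)\int_E x_\xi(da)\,\Big[F^{(\xi_1,\ldots,\xi_n)}(\Phi_{r,a,B_0(\xi)}(x))-F^{(\xi_1,\ldots,\xi_n)}(x)\Big]\\ &+\frac{1}{N^n}\Big(\bigotimes_{l=1}^n\sum_{\xi_l\colon\phi\xi_l=\eta_l}\Big)\sum_{\xi\in G_{N,2}}N^{-2}\int_{[0,1]}\Lambda^*_1(dr)\int_E y_{\xi,1}(da)\,\Big[F^{(\xi_1,\ldots,\xi_n)}(\Phi_{r,a,B_1(\xi)}(x))-F^{(\xi_1,\ldots,\xi_n)}(x)\Big]\\ = {}&I_0^{\eta_1,\ldots,\eta_n}+I_1^{\eta_1,\ldots,\eta_n}.\end{aligned}$$

Let us begin with the second term in (7.46), which corresponds to $I_1$ in (7.32) and was rewritten in (7.35)-(7.37) as

(7.47) $$I_1^{\eta_1,\ldots,\eta_n}=\sum_{\eta\in G_{N,1}}N^{-1}\int_{[0,1]}\Lambda^*_1(dr)\int_E y_\eta(da)\,\Big[F^{(\eta_1,\ldots,\eta_n)}(\Phi_{r,a,\eta}(y))-F^{(\eta_1,\ldots,\eta_n)}(y)\Big].$$

Combine (7.45) and (7.47), change to time scale $Nt$ and compare the result to (6.32). We obtain that $I'_1$ on time scale $Nt$ behaves analogously to (6.32) on time scale $t$. By moving one time scale upwards, we obtain as in (6.43) (respectively, (7.21) with $d_1=\frac{c_0(\lambda_0+2d_0)}{2c_0+\lambda_0+2d_0}$) that

(7.48) $$\lim_{N\to\infty}(I'_1)^{[2]}=\frac{c_1\lambda_1}{2c_1+\lambda_1+2d_1}\int_E\int_E Q_z(du,dv)\,\frac{\partial^2F(z)}{\partial z\,\partial z}[\delta_u,\delta_v].$$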



The term $I'_0$ can be handled in the same spirit as $I_0$ in (7.32). To obtain non-zero contributions in $I'_0$, we need to have $|\{\xi_l\colon\phi\xi_l=\eta_l,\,1\le l\le n\}|<n$ (recall (6.33)). This is possible only if $|\{\eta_1,\ldots,\eta_n\}|<n$. Reasoning similarly as in (6.34), we obtain negligible terms if $|\{\xi_l\colon\phi\xi_l=\eta_l,\,1\le l\le n\}|<n-1$. Indeed, two equivalent sites already result in a factor of $O(N^{-2})$ (on time scale $t$): first a common block has to be chosen ($|\{\eta_1,\ldots,\eta_n\}|=n-1$), which contributes a factor $N^{-2}\sum_{\eta\in G_{N,1}}$, and subsequently a common site has to be chosen, which contributes a factor $N^{-2}\sum_{\xi\colon\phi\xi=\eta}$. Any additional choice results in terms that vanish for $N\to\infty$ on time scale $N^2t$. Consequently, we can reason as in (6.35)-(6.40) to obtain

(7.49) $$I'_0=\frac{1}{N^2}\sum_{\eta\in G_{N,1}}\frac{1}{N^2}\sum_{\xi\colon\phi\xi=\eta}\int_{[0,1]}\Lambda^*_0(dr)\int_E x_\xi(da)\,\frac12\,\frac{\partial^2F(z)}{\partial z\,\partial z}\big[r(-x_\xi+\delta_a),\,r(-x_\xi+\delta_a)\big]+O(N^{-3}).$$



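Additional Fleming-Viot part. We proceed as for the migration operator. Recall (7.40), to get

(7.50) $$(L^{(N,2)[1]}_{\mathrm{FV}}F)(z)=d_0\sum_{\eta\in G_{N,1}}\frac1N\sum_{\xi\colon\phi\xi=\eta}\int_E\int_E Q_{x_\xi}(du,dv)\,\frac{\partial^2(F\circ z)(y)}{\partial y_\eta\,\partial y_\eta}[\delta_u,\delta_v].$$

Now use the analogue to (7.39), to obtain

(7.51) $$(L^{(N,2)[1]}_{\mathrm{FV}}F)(z)=d_0\sum_{\eta\in G_{N,1}}\frac1N\sum_{\xi\colon\phi\xi=\eta}\int_E\int_E Q_{x_\xi}(du,dv)\,\frac{\partial^2F(z)}{\partial z\,\partial z}\Big[\frac{\delta_u}{N},\frac{\delta_v}{N}\Big].$$

After changing to time scale $N^2t$, we have

(7.52) $$(L^{(N,2)[2]}_{\mathrm{FV}}F)(z)=d_0\,\frac1N\sum_{\eta\in G_{N,1}}\frac1N\sum_{\xi\colon\phi\xi=\eta}\int_E\int_E Q_{x_\xi}(du,dv)\,\frac{\partial^2F(z)}{\partial z\,\partial z}[\delta_u,\delta_v].$$

Tightness. We have to bound the generator and apply the tightness criterion, as explained in Section 6.1.1. We omit the details.

Convergence to McKean-Vlasov process. We have to identify the limiting generators. One approach would be to try and make the following heuristics rigorous.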

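Begin heuristics. On time scale $N^2t$, we obtain, by reasoning as in (7.21),

(7.53) $$\begin{aligned}\lim_{N\to\infty}(I'_0)^{[2]} &= \frac{\lambda_0}{2}\lim_{N\to\infty}\frac1N\sum_{\eta\in G_{N,1}}\int_{\mathcal P(E)}\nu_{y_\eta}^{c_0,d_0,\Lambda_0}(dx)\int_E\int_E Q_x(du,dv)\,\frac{\partial^2F(z)}{\partial z\,\partial z}[\delta_u,\delta_v]\\ &= \frac{c_0\lambda_0}{2c_0+\lambda_0+2d_0}\lim_{N\to\infty}\frac1N\sum_{\eta\in G_{N,1}}\int_E\int_E Q_{y_\eta}(du,dv)\,\frac{\partial^2F(z)}{\partial z\,\partial z}[\delta_u,\delta_v]\\ &= \frac{c_0\lambda_0}{2c_0+\lambda_0+2d_0}\int_{\mathcal P(E)}\nu_z^{c_1,d_1,\Lambda_1}(dy)\int_E\int_E Q_y(du,dv)\,\frac{\partial^2F(z)}{\partial z\,\partial z}[\delta_u,\delta_v]\\ &= \frac{2c_1}{2c_1+\lambda_1+2d_1}\,\frac{c_0\lambda_0}{2c_0+\lambda_0+2d_0}\int_E\int_E Q_z(du,dv)\,\frac{\partial^2F(z)}{\partial z\,\partial z}[\delta_u,\delta_v].\end{aligned}$$

Combine (7.48) with (7.53), to get

(7.54) $$\lim_{N\to\infty}(L^{(N,2)[2]}_{\mathrm{res}}F)(z)=\frac{2c_1}{2c_1+\lambda_1+2d_1}\Big[\frac{\lambda_1}{2}+\frac{c_0\lambda_0}{2c_0+\lambda_0+2d_0}\Big]\int_E\int_E Q_z(du,dv)\,\frac{\partial^2F(z)}{\partial z\,\partial z}[\delta_u,\delta_v].$$

For the Fleming-Viot part, we obtain, by reasoning once more as in (7.21),

(7.55) $$\begin{aligned}\lim_{N\to\infty}(L^{(N,2)[2]}_{\mathrm{FV}}F)(z) &= d_0\lim_{N\to\infty}\frac1N\sum_{\eta\in G_{N,1}}\int_{\mathcal P(E)}\nu_{y_\eta}^{c_0,d_0,\Lambda_0}(dx)\int_E\int_E Q_x(du,dv)\,\frac{\partial^2F(z)}{\partial z\,\partial z}[\delta_u,\delta_v]\\ &= \frac{2c_0d_0}{2c_0+\lambda_0+2d_0}\lim_{N\to\infty}\frac1N\sum_{\eta\in G_{N,1}}\int_E\int_E Q_{y_\eta}(du,dv)\,\frac{\partial^2F(z)}{\partial z\,\partial z}[\delta_u,\delta_v]\\ &= \frac{2c_0d_0}{2c_0+\lambda_0+2d_0}\int_{\mathcal P(E)}\nu_z^{c_1,d_1,\Lambda_1}(dy)\int_E\int_E Q_y(du,dv)\,\frac{\partial^2F(z)}{\partial z\,\partial z}[\delta_u,\delta_v]\\ &= \frac{2c_1}{2c_1+\lambda_1+2d_1}\,\frac{2c_0d_0}{2c_0+\lambda_0+2d_0}\int_E\int_E Q_z(du,dv)\,\frac{\partial^2F(z)}{\partial z\,\partial z}[\delta_u,\delta_v].\end{aligned}$$

Collecting the limiting terms as $N\to\infty$ on time scale $N^2t$ for migration (7.44), resampling (7.54) and Fleming-Viot (7.55), we obtain

(7.56) $$\lim_{N\to\infty}(L^{(N,2)[2]}F)(z)=\frac{2c_1}{2c_1+\lambda_1+2d_1}\Big[\frac{\lambda_1}{2}+\frac{c_0\lambda_0+2c_0d_0}{2c_0+\lambda_0+2d_0}\Big]\int_E\int_E Q_z(du,dv)\,\frac{\partial^2F(z)}{\partial z\,\partial z}[\delta_u,\delta_v].$$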
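As a consistency check (a short algebraic sketch that uses only the expression for $d_1$ stated in (7.6)), the prefactor in (7.56) coincides with the constant $d_2$ announced in (7.7):

```latex
\begin{aligned}
\frac{2c_1}{2c_1+\lambda_1+2d_1}
  \Big[\frac{\lambda_1}{2}+\frac{c_0\lambda_0+2c_0d_0}{2c_0+\lambda_0+2d_0}\Big]
&= \frac{2c_1}{2c_1+\lambda_1+2d_1}\Big[\frac{\lambda_1}{2}+d_1\Big]
 && \text{since } d_1=\frac{c_0(\lambda_0+2d_0)}{2c_0+\lambda_0+2d_0},\\
&= \frac{c_1(\lambda_1+2d_1)}{2c_1+\lambda_1+2d_1}
 = d_2 .
\end{aligned}
```

The heuristic limit therefore reproduces the same renormalisation map $(c,\lambda,d)\mapsto c(\lambda+2d)/(2c+\lambda+2d)$ one level up.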



In order to obtain the convergence in (7.53)-(7.55), we would need to restrict the set of configurations, argue that the law of the process lives on that set of configurations, and show that therefore the compensators of the martingale problems converge to the compensator of the limit process. However, it is technically easier to follow a different route, as we do below. End heuristics.

We want to view the expression for $(L^{(N,2)[2]}F)(z)$ as an average over $N$ different 1-block averages. If we replace the 1-block averages by a system of $N$ exchangeable Fleming-Viot diffusions with resampling constant $d_1$ (for which we have a formula in terms of $c_0$, $d_0$ and $\lambda_0$), which on time scale $Nt$ lead to the generator

E E 

then we can apply the analysis of Section [6] to this new collection of processes, denoted by 
(7.58) {Y} N \tN),i = l,...,N}, 

to conclude that the block average Z( N \tN) = N- 1 Y, Yi N (Nt) satisfies 



8=1 



(7.59) C[{Z^ N \tN 2 )) t>Q ] =► £[(Z(t)) 4>0 ], 

JV— ^00 

where Z is a Fleming- Viot diffusion with resampling constant 

(7.60) ^— (X 1 + 2d 1 ), where d!- C °( A ° + 2d °) 



2c x + Ai + 2di v ±y ' 2c + Ao + 2d 

Hence we obtain a limit process with a generator acting on F as 



2ci + Ai + 2d\ J E J E dzdz 



Hence, the weak limit points of the laws {C[(Z^ (tN 2 ))t>o], N G N} satisfy the martingale 
problem with generator (Lg' d2 ' 5 °G)(z) with cfo = 2^1+^+2^1 • 

Since we know that the martingale problem for the generator L°' d '^ and for the test 
functions given by C?{V(E),M) is well-posed, we have the claimed convergence in (7.7) on 
path space if Z (a weak limit point for the original problem) and Z agree. Thus, we have to 
argue that it is legitimate to 

(7.62) replace {((y} N) (Nt)) i=1 _ N ) t > } by {(y} N) (Nt) i=1 ^ N ) t > }. 

For that purpose, observe that we know from Section [6] that, for a suitable subsequence 
along which C[{Z^ N \sN 2 )) s>o] converges to Z(s), 

(7.63) £[((y/ W) (iV 2 S + iVt)) i=1 ,...^)i> ] — > C[(Yf°°\s,t)) t > ], 

N—*oo 

where the right-hand side is the McKean-Vlasov process with Fleming- Viot part at rate d%, 
Cannings part Ai, and immigration-emigration at rate c\ from the random source Z{s). We 
need to argue that the latter implies that Z and Z agree. 

For F G (%(V(E),m), define G N G C 2 ((V(E)) N ,R) and H N G Cl((V(E)) N ' 2 ,R) by 

(7.64) F(z) = G N (y) = H N (x), x G {V{E)) N \ y G (V(E)) N , zeV(E), 
with 

(7.65) z = — yi= N 

ie{i,...,N} je{i,...,N}

In order to verify that Z and Z agree, it suffices to show that the compensator processes for
Z and Z agree for a measure-determining family of functions $F \in C_b^2(\mathcal{P}(E),\mathbb{R})$, namely,



C 



(7.66) 



c 



ds 



ExE 



tN 2 



ds 



L 



TV-s-oo 



Zero measure. 



dytdy. 

^ N N 

AT2 Z^, ^ ''--■<■< 

i=l j = l 



t>0- 



L™H N (x 



J,' 



») 



To that end, first note that the two terms with L^'^'^ 1 ' cancel each other out. Regarding the 
remaining terms, after we transform $s$ to $sN^2$, we must show that for each $s \in [0,t]$ the term
in the second line converges weakly to the term in the first line (the joint law of the density
and the empirical measure converges). When worked out in detail, this requires a somewhat
subtle argument. However, nothing is specific to our model: a detailed argument along these
lines can be found in [DGV95], pp. 2322-2339.



7.2 Finite-level systems 

The next step is to consider general $K \geq 3$ (recall the beginning of Section 7). We can copy
the arguments used for $K = 2$, and argue recursively, namely, we can view the $(j-1)$-, $j$- and
$(j+1)$-block averages as a two-level system on time scales $tN^{j-1}$, $N(tN^{j-1})$, $N^2(tN^{j-1})$. The limit as
$N \to \infty$ is a two-level system with migration rates $c_{j-1}, c_j, c_{j+1}$ instead of $c_0, c_1, c_2$, resampling
measures $\Lambda_{j-1}, \Lambda_j, \Lambda_{j+1}$ instead of $\Lambda_0, \Lambda_1, \Lambda_2$, and volatility $d_{j-1}$ instead of $d_0$. If we would
have $c_0 = c_1 = \cdots = c_{j-2} = 0$ and $\Lambda_0 = \cdots = \Lambda_{j-2} = 0$, then this would be literally the
case. Hence, the key point is to show that the lower-order perturbation terms play no role
in the renormalised dynamics after they have played their role in determining the coefficients
$d_{j-1}, d_j, d_{j+1}$.

The argument has again a tightness part, which is the same as before and which we do
not discuss, and a finite-dimensional distributions part. Since the solution of the martingale
problem is uniquely determined by the marginal distributions (see [EK86, Section 4.4.2]), this
part is best based on duality, which determines the transition kernel of the process as follows.

We have to verify that the dual of the $(j+1)$-level system on the time scales $N^{j-1}t$, $N^{j}t$
behaves like the dual process of a two-level system. This means that the dual process can be
replaced by the system where the locations up to level $j-2$ are uniformly distributed and
all partition elements originally within that distance have coalesced. This can be obtained
by showing that the dual system with the lower-order terms is instantaneously uniformly
distributed in small balls, and that within that distance coalescence is instantaneous, since we
are working with times at least $tN^{j-1}$. Therefore, the dynamics as $N \to \infty$ results effectively
in a coalescent corresponding to a two-level system.



8 Proof of the hierarchical mean-field scaling limit 



We are finally ready to prove Theorem 1.4. In this section we approximate our infinite spatial
system by finite spatial systems of the type studied in Section 7. As before, we denote the
finite system with geographic space $G_{N,K}$ by $X^{(N,K)}$, and the one with $G = \Omega_N$ by $X^{(\Omega_N)}$.

Proposition 8.1. [K-level approximation]

For t G (0,oo) and sjy G (0, 00) with liniAr^oo sjy 



Y c { ? n) and Y^ K) on time scale tW + s N N k for < k < j <K. Then 



(8.1) d Prokh (C [(Y$ K \tN> + s N N k ) 
where dp ro kh is the Prokhorov metric. 



00 and liniAr^oo sn/N = 0, consider 
hen 

Y^' K \tNi + s N N k )])) 0, 



Once we have proved this proposition, we obtain Theorem 1.4 by observing that (8.1)
allows us to replace our system on $\Omega_N$ by the one on $G_{N,K}$ when we are interested only in
block averages of order $< K$ on time scales of order $< N^K$. In that case we can use the result
of Section 7 to obtain the claim of the theorem for $(j,k)$ with $k < j < K$. Thus, it remains
only to prove Proposition 8.1. We give the proof for $K = 1$, and later indicate how to extend
to $K \in \mathbb{N}$.

The main idea is the following. We want to compare the laws of the solution of two
martingale problems at a fixed time and show that their difference goes to zero in the weak
topology. To this end, it suffices to show that the difference of the action of the two generators
in the martingale problems on the functions in the algebra $\mathcal{F}$ tends to zero. Indeed, we then
easily get the claim with the help of the formula of partial integration for two semigroups
$(V_t)_{t\geq 0}$ and $(U_t)_{t\geq 0}$ with generators $L_V$ and $L_U$ (see e.g. Ethier and Kurtz [EK86, Section 1, (5.19)]):

(8.2) $V_t = U_t + \int_0^t U_{t-s}\,(L_V - L_U)\,V_s\,ds.$

In Sections 8.1-8.2 we calculate and asymptotically evaluate the difference of the generator
acting on $\mathcal{F}$ on the two spatial and temporal scales.

8.1 The single components on time scale t 

For an $F$ that depends only on $x_\xi$, $\xi \in B_1(0)$, we have, as we will see below,

(8.3) $(L^{(\Omega_N)}F)(x) = (L^{(N,2)}F)(x) + O(N^{-1}),$

where the error term is uniform in $x$ and only depends on the choice of $F$. By the formula of
partial integration for semigroups, it follows that

(8.4) $\left| E[F(X^{(\Omega_N)}(t))] - E[F(X^{(N,K)}(t))] \right| \leq O(N^{-1}).$

Since our test functions are measure-determining, the claim follows for any finite time horizon.



To prove (8.3), we discuss the different parts of the generators separately.

Consider the migration operator in (1.36) applied to functions $F \in \mathcal{F}$, the algebra of
functions of the form in (1.33). The migration operator can be rewritten, similarly as in (7.8),



.5) (L^F)(x) = Ep^W 



£en N ken 



8F(x) 
dx£ 



[«.]■ 



We obtain 



(8.6) $(L^{(\Omega_N)}_{\mathrm{mig}}F)(x) = \sum_{\xi\in\Omega_N} c_0 \int_E (y_{\xi,1} - x_\xi)(da)\,\frac{\partial F(x)}{\partial x_\xi}[\delta_a] + E^{(N)},$

where

(8.7) $|E^{(N)}| \leq N^{-1} C_F \sum_{k\in\mathbb{N}\setminus\{1\}} c_{k-1} N^{2-k},$

with $C_F$ a generic constant depending on the choice of $F$ only. Here we use that, by the
definition of $F$ in (1.33), the sum over $\xi \in \Omega_N$ is a sum over finitely many coordinates only,
with the number depending on $F$ only. By (1.26) we get

(8.8) $|E^{(N)}| \leq O(N^{-1}).$



For the resampling operator we obtain, applying first (1.38) and then (1.31), similarly as 



in (1.37), 



(8.9) (Lg^F)(x) = W A (dr) / x^da) [F (% a , Bo{0 (x)) - F(x)] +#' 



with 



5.10) \EW\<J2 N ~ 2k f Al{dr)C F N k r 2 = C F Y J N ~ kx k = 0{N- 1 ) . 



Finally, the Fleming- Viot operator reads as in ( |7.13 ): 



.11) (L?^ F)(x) = d 



Q x (du,dv) 



E JE 



d 2 F(x) 
dx^dx^ 



Combining the migration parts in (8.6) and (8.8), the resampling parts in (8.9) and (8.10)
and the Fleming-Viot part in (8.11), we obtain



1.12) 



(L^F)(x) = £ c f (y 5il -^)(da)^^[<y + 0(iV- 1 ) 

+ J2[ A S( dr ) / ^(daj^VofflWl-^Wl+Or 1 ) 



+ 7] / / Q^(dn,df) 



d 2 F(x) 
dx^dx^ 



[s u ,s v ]. 



Combining (8.12) with (8.5 — 8.11) and (7.14) (also recall the discussion on embeddings from 



Section 5.2), we get the claim. 



8.2 The 1-block averages on time scale Nt 

As before we prove, for $F$ depending on $\xi \in B_1(0)$ only,

(8.13) $(L^{(\Omega_N)[1]}F)(y) = (L^{(N,2)[1]}F)(y) + O(N^{-1}),$

after which the claim follows in the limit as $N \to \infty$ by the same argument as in Section 8.1.

We prove (8.13) by considering separately the different parts of the generator.

For the 1-block averages $y$, the migration operator can be calculated as in (7.25).



Using (7.26), we get 



y l X, ~ Vr > ) (da 



(8-14) (L^F)(y) = ± Yl I 
We obtain on the time scale Nt 

(8-15) (L^F)(y)= 5>| {yK-y^daf-^M + EW, 



where 



5.16) 



E (N) 



<C F Y, CkN 1 ^ = OiN- 1 ) . 
fceN\{i} 



Note that, by ( |7.27[ ), 

(8.17) (L^ [1] F)(y) = (L^f [l] F)(y) + 0(N~ 1 ) 



(8.18) 



For the resampling operator, the only change to (7.31) is that (7.32) gets replaced by 
N) F )( y ) = I + I 1+ E^ 



with Io,h as in (7.32) (with Gn,2 replaced by Q,^) and 
1 / . 



5.19) 



E (N) 



q n t 



< 



N L 



\i=i i= i e |. ^i= v (i)/ keN\{i} 

= C F Y, N- k X k = 0{N- 2 ). 
fceN\{i} 

After a change to time scale Nt, we therefore have 
(8.20) (L^F)(y) = (L^F)(y) + OfN' 1 ) 



V N~ 2k [ 



A* k (dr)LN k C F r 2 



with (L [ r es 2) F)(y) as in §L3l} . 

The Fleming- Viot operator on time scale t reads as in (7.38), respectively, on time scale 
Nt as in fl7.40| ), 

(8.21) (L%^F)(y) = (L^ 2)[1] F)(y). 
8.3 Arbitrary truncation level 

For every $K \in \mathbb{N}$, consider the block averages up to level $K-1$ on time scales up to $N^K t$,
estimate the generator difference, bound this by an $O(N^{-1})$-term and get the same conclusion
as above. There are more indices involved in the notation, but the argument is the same. The
details are left to the reader.

9 Multiscale analysis

9.1 The interaction chain



In this section, we prove Theorem 1.8. In addition to Theorem 1.4, what is needed is the
convergence of the joint law of the collection of $k$-level block averages for $k = 0,\ldots,j+1$ on
the corresponding time scales $N^j t_N + N^k t$, with $\lim_{N\to\infty} t_N = \infty$ and $\lim_{N\to\infty} t_N/N = 0$. We
already know that the $\ell$-block averages for $\ell > k$ do not change on time scale $tN^k$ and that
this holds in path space as well. Hence, in particular, the $(j+1)$-block average converges to
a constant path at times $N^j t_N + N^k t$ for all $0 \leq k \leq j$. We also have the convergence of the
marginal distributions for each $k = 0,\ldots,j+1$, namely, we know that the process on level $k$
solves a martingale problem on time scale $tN^k$, which we have identified and where only the
block average on the next level appears as a parameter. Therefore, arguing downward from
level $j+1$ to level $j$, we see that the Markov property holds for the limiting law. It therefore
only remains to identify the transition probability.

We saw in Section 7 that when going from level $k+1$ to level $k$, we get the corresponding
equilibrium law of the level-$k$ limiting dynamics as a McKean-Vlasov process with parameters
$(c_k, \theta, d_k, \Lambda_k)$ with $\theta$ equal to the limiting state on level $k+1$. Note here that, instead of
$N^{k+1}s + N^k t$, we can write $N^{k+1}s + N^k t_N$ with $\lim_{N\to\infty} t_N = \infty$ and $\lim_{N\to\infty} t_N/N = 0$,
since an $o(1)$ perturbation of $s$ has no effect as $N \to \infty$. For more details, consult [DGV95,
Section 5(f)].

In the remainder of this section, we prove the implications of the scaling results of (dk)keN 
for the hierarchical multiscale analysis of the process X^ N \ involving clustering versus co- 



existence (Section 9.2), related phase transitions (Section |9.1[), as well as a more detailed 



description of the properties of the different regimes (Section mM), as discussed in Section 1.5.5 



9.2 Dichotomy for the interaction chain 

In this section, we prove Theorem 1.9. Fix $j \in \mathbb{N}_0$. The first observation is that the interaction
chain $(M_k^{(j)})_{k=-(j+1),\ldots,0}$ is a $\mathcal{P}(E)$-valued Markov chain such that, for any $\varphi \in C_b(E)$,

(9.1) $(\langle M_k^{(j)}, \varphi\rangle)_{k=-(j+1),\ldots,0}$ is a square-integrable martingale

(because it is bounded). For the analysis of the interaction chain for Fleming-Viot diffusions,
carried out in [DGV95, Section 6], this fact was central in combination with the formula for
the variance of evaluations analogous to Proposition 4.5. We argue as follows.

Since the map 9 i— > Ug' d ' is continuous, the convergence as j — > oo in the local coexistence 
regime is a standard argument (see |DGV95| Section 6a]). In the clustering regime, the 
convergence to the mono-type state follows by showing, with the help of the variance formula, 
that lim^ooE (j) [Var-j^)] = for all (p € C^(E), so that all limit points of £[M( J )] are 

concentrated on 5-measures on E (recall that V{E) is compact). This argument is identical 
to the one in |DGV95[ Section 6a] . The mixing measure for the value of the mono- type state 
can be identified via the martingale property. 

It remains to show that in the case where $E^{(j)}[\mathrm{Var}_x(\varphi)]$ is bounded away from zero,
the limit points allow for the coexistence of types. The argument in [DGV95, Section 6a]
shows that for $\Lambda = \delta_0$,

(9.2) $\nu_\theta^{c,d,\Lambda}(M) = 0$ if $d > 0$, $M = \{\delta_u \colon u \in E\}$.

This is no longer true for $\Lambda \neq \delta_0$. Instead, we have $\nu_\theta^{c,d,\Lambda}(M) \in [0,1)$, as proven in Section 4.3
(see (4.12)), and hence the variance is $> 0$.



9.3 Scaling for the interaction chain 



In this section we prove Theorems 1.11 and 1.12.

The proof of the scaling result in the regime of diffusive clustering in [DGV95, Section
6(b), Steps 1-3] uses two ingredients:

(I) The following process is a square-integrable martingale:

(9.3) $(\langle M_k^{(j)}, f\rangle)_{k=-(j+1),\ldots,0} \qquad \forall\, f \in C_b(E,\mathbb{R}).$

(II) For c k ^c£ (0,oo) as k -> oo, by |DGV951 Eq. (6.12)], 

(9.4) Var ((ikfg, /) | M% =6) = ^ ~ + 1 Var (/), V/6^,1). 

{-h)/j = Pi G [0,1], 



In [DGV95, Section 6(b)], (I— II) led to the conclusion that if lim^oo 
i = 1,2, with fii > fa, then 

(9.5) lim Var ((M^ , f) \ M® =9) = Var e (/). 

Thus, as soon as we have these formulae, we get the claim by repeating the argument in 
[DGV95, Section 6(b)], which includes the time transformation (3 = e~ s in Step 3 to obtain a 



time-homogeneous expression from (9.5) 



We know the necessary first and second moment formulae from Section |4.4| Replace 



|DGV95l Eq. (6.12)] by (|4.27[), to see that we must make sure that 



[Ml / , \firfl 
(9.6) lim Y 



mi 



r 



Note that (9.6) remains valid also for /?2 = 0. 



Moreover, by following the reasoning in [DGV95, Section 6(b), Step 4], we obtain by using 



427] instead of [DGV951 (6.34)] that 
fast growing clusters 



(9.7) 



d 



slowly growing clusters [ ^— ' \ c, JJ ] 
when m, n — > oo such that n/m — > a, for all a E (0, 1). 



i+l 

°i i mi 1 + m Z 
t=«+l 



Proof of Theorem 1.11, The proof follows by inserting the asymptotics of c^, d^ and m& ob- 
tained in Theorem 



1.6 



and Corollary 1.10 into (9.6) or (9.7). 



(i) In Cases (a) and (b), the asymptotics in (1.51-1.52) and (1.73) imply 

di- 



(9-8) £ 

i=[amj \ l=i+l 



H+l 
Ci 



mi 



$O(e^{-Cn})$, $C > 0$.

In Case (c), using the fact that $d_{l+1}/c_l \sim m_l \to 0$ and $\sum_{l\in\mathbb{N}} m_l = \infty$, we obtain



(9.9) Yl 

i= [am] 



di+i 

Ci 



{=*+! 



(ii) In Case (d), for any e > and 2 large enough we have |m/ — R/l\ < eR/l. This implies 



(wo) nrr^ = «p - E (t +0( '™' ) ) 

Since (fj+i/cj ~ i2/i and m; = 0(1//), it follows that 

LA>jJ 



(9-11) £ 



n r 

Z=i+1 



+ 771/ 



IAiJ T-> / a -\ -R 

i=L/32iJ 



□ 



Proof of Theorem 1.12. In Case (A), — )• 00, which implies fast clustering. In Case (B), 
TTtfc — )• K + M > 0, which implies fast clustering. In Case (CI), ~ (cfcUfc) -1 — >• C > 0, 
which implies fast clustering. In Case (C2), dk/c^ ~ ~ (1 — c)/c > 0, which implies fast 
clustering. In Case (C3), dk/ck ~ rrik ~ ^fc/( c fc(^ ~~ !))> which implies fast, diffusive and slow 
clustering depending on the asymptotic behaviour of kri k /ck- □ 



10 Dichotomy between clustering and coexistence for finite N 



In this section, we prove Theorems 1.13-1.14 



Proof of Theorem 1.13. The key is the spatial version of the formulae for the first and second 
moments in terms of the coalescent process. The variance tends to zero for all evaluations 
if and only if the coalescent started from two individuals at a single site coalesces into one 
partition element. Therefore, all we have to show is that the hazard function for the time to 
coalesce is Hn, and then show that lmijv_ s . 00 Hn = 00 a.s. if and only if limAr_ >00 Hn = 00. 
The latter was already □ 



Proof of Theorem 1.14. We first note that the set of functions 
(10.1) {Hjri(;ir G>n ): neN,ipeC h (E n ,R),Tr G>n eIl G>n }, 

is a distribution-determining subset of the set of bounded continuous functions on V(V(E)) G . 
It therefore suffices to establish the following: 

(1) For all initial laws £[X( njv )(0)] satisfying our assumptions for a given parameter 9 € 
V{E) and all admissible n,(p,TV Gin , 



(10.2) E\H^(X^\t),n G , n ) 



F((cp,n,7r G)n ),e), 



which implies that $\mathcal{L}[X^{(\Omega_N)}(t)]$ converges to a limit law as $t \to \infty$ that depends on the
initial law only through the parameter $\theta$.

(2) Depending on whether $H_N < \infty$ or $H_N = \infty$, the quantity in the right-hand side of
(10.2) corresponds to the form of the limit claimed in (1.79-1.80).

Item (2) follows from Theorem 1.13 once we have proved the convergence result in (10.2)



since (1.80) implies that the marginal law of the limiting state is 8$, and we will see in (10.5) 



below that recurrence of the transition kernel a implies that 



(10.3) E<^ 



<¥>> 



i=l 

which in turn implies 



</»,0>, for pfai, 



L n) = \\f{ui), 



i=l 



(10-4) u { tl 



(8 u )® n »e(du). 



K 



In order to prove item (1), we use duality and express the expectation in the left-hand side



of (|10.2|) as an expectation over a coalescent <^ N ^ starting with n partition elements. We 



therefore know that the number of partition elements, which is nonincreasing in $t$, converges
to a limit as $t \to \infty$, which is 1 for $H_N = \infty$ and a random number in $\{1,\ldots,n\}$ for $H_N < \infty$.
This means that there exists a finite random time after which the partition elements never
meet again, and keep on moving by migration only. For such a scenario, it was proven in
[DGV95, Lemma 3.2] that the positions of the partition elements are given, asymptotically,
by $k = 1,\ldots,n$ random walks, all starting at the origin. Using that the initial state is ergodic,
we can then calculate, for $\varphi(u_1,\ldots,u_n) = \prod_{k=1}^{n} f(u_k)$,



(10.5) }™e[h^ (xV«\0),<L? n) )} =E(/'^fc 



k=l 



with q k ,n the probability that the coalescent starting in 7TG,n i n the limit has k remaining 
partition elements. Furthermore, if the initial positions of a sequence (7ig^) me N of initial 
states satisfies linim-^oo d(^ m ^ , rj ^ ) = oo for i ^ j, then for transient a we know that 



(10.6) lim q k 



(4 m 2) 



0, V k = 1, . . . , n — 1 and lim q : 

m— >oo 



In view of (10.5), this proves that the law on $(\mathcal{P}(E))^G$ defined by the right-hand side of (10.2)
is a translation-invariant and ergodic probability measure, with mean measure $\theta$ (see [DGV95],
p. 2310, for details). □



11 Scaling of the volatility in the clustering regime 

In Section 11.1 we prove Theorems 1.5 and 1.15; in Section 11.3 we prove Theorem 1.6.

11.1 Comparison with the hierarchical Fleming-Viot process

(a) Rewrite the recursion relation in (1.42) as

(11.1) $d_0 = 0, \qquad \frac{1}{d_{k+1}} = \frac{1}{c_k} + \frac{1}{\mu_k + d_k}, \qquad k \in \mathbb{N}_0.$

From (11.1), it is immediate that $c \mapsto d$ and $\mu \mapsto d$ are component-wise non-decreasing.

(b) To compare $d$ with $d^*$, the solution of the recursion relation in (1.48) when $\mu_0 > 0$ and
$\mu_k = 0$ for all $k \in \mathbb{N}$, simply note that $d_1 = d_1^* = c_0\mu_0/(c_0+\mu_0)$. This gives

(11.2) $d_k \geq d_k^*, \qquad k \in \mathbb{N},$

with $d_k^*$ given by (1.49).

(c) Inserting the definition $m_k = (\mu_k + d_k)/c_k$ into (11.1), we get the recursion relation

(11.3) $c_0 m_0 = \mu_0, \qquad c_{k+1} m_{k+1} = \mu_{k+1} + \frac{c_k m_k}{1 + m_k}, \qquad k \in \mathbb{N}_0.$

Iterating (11.3), we get

(11.4) $c_k m_k = \sum_{l=0}^{k} \frac{\mu_l}{\prod_{j=l}^{k-1}(1+m_j)}.$

Ignoring the terms in the denominator, we get

(11.5) $m_k \leq \frac{1}{c_k} \sum_{l=0}^{k} \mu_l,$

which proves that $\sum_{k\in\mathbb{N}_0} (1/c_k) \sum_{l=0}^{k} \mu_l < \infty$ implies $\sum_{k\in\mathbb{N}_0} m_k < \infty$. To prove the reverse,
suppose that $\sum_{k\in\mathbb{N}_0} m_k < \infty$. Then $\prod_{j\in\mathbb{N}_0} (1+m_j) = C < \infty$. Hence (11.4) gives

(11.6) $m_k \geq \frac{1}{C}\,\frac{1}{c_k} \sum_{l=0}^{k} \mu_l,$

which after summation over $k \in \mathbb{N}_0$ completes the proof.



(d) We know from (1.49) that $d_k \geq d_k^* = \mu_0/(1 + \mu_0\sigma_k)$ for $k \in \mathbb{N}$. Hence, if $\lim_{k\to\infty} \sigma_k = \infty$,
then $\liminf_{k\to\infty} \sigma_k d_k \geq 1$. To get the reverse, note that iteration of (11.1) gives



1 

dh 



k-l 



;n.7) 



l=0 Qn-=li(i + ^: 



= E 

1=0 

k-l 

Z=0 



> 



£ 

z=o 



k-l /-i i ' 



Qnr= w (i+2[i+wi)' 



If $\sum_{j\in\mathbb{N}_0} \sigma_j\mu_j < \infty$, then the product in the last line tends to 1 as $l \to \infty$. Hence, if also
$\lim_{k\to\infty} \sigma_k = \infty$, then it follows that $\liminf_{k\to\infty} 1/(\sigma_k d_k) \geq 1$.

Note from the proof of (c) and (d) that in the local coexistence regime $d_k \sim \sum_{l=0}^{k} \mu_l$ as
$k \to \infty$ when this sum diverges and $d_k \to \sum_{l\in\mathbb{N}_0} \mu_l / \prod_{j=l}^{\infty}(1+m_j) \in (0,\infty)$ when it converges.
We close with the following observation. Since $1/(c_k\sigma_k) = (\sigma_{k+1} - \sigma_k)/\sigma_k$, $k \in \mathbb{N}$, and


(Hi 



Q~k > Q~k+1 



0~k 



> 



o-i 



o~ k 



<Jk+l 

X 



k € N, 



we have 



(11.9) lim a k 

k— >oo 



1 



OO 



V — 

t^i ck<Tk 



oo. 




11.2 Preparation: Möbius-transformations

To draw the scaling behaviour of $d_k$ as $k \to \infty$ from (11.1), we need to analyse the recursion
relation

(11.10) $x_0 = 0, \qquad x_{k+1} = f_k(x_k), \qquad k \in \mathbb{N}_0,$

where

(11.11) $f_k(x) = \frac{c_k x + c_k\mu_k}{x + (c_k + \mu_k)}, \qquad x \neq -(c_k + \mu_k).$

The map $x \mapsto f_k(x)$ is a Möbius-transformation on $\mathbb{R}^*$, the one-point compactification of $\mathbb{R}$.
It has determinant $c_k(c_k + \mu_k) - c_k\mu_k = c_k^2 > 0$ and therefore is hyperbolic (see Kooman [K98];
a Möbius-transformation $f$ on $\mathbb{R}^*$ is called hyperbolic when it has two distinct fixed points at
which the derivatives are not equal to $-1$ or $+1$). Since

(11.12) $f_k(x) = c_k\left(1 - \frac{c_k}{x + (c_k + \mu_k)}\right), \qquad x \neq -(c_k + \mu_k),$

it is strictly increasing except at $x = -(c_k + \mu_k)$, is strictly convex for $x < -(c_k + \mu_k)$ and
strictly concave for $x > -(c_k + \mu_k)$, has horizontal asymptotes at height $c_k$ at $x = \pm\infty$ and
vertical asymptotes at $x = -(c_k + \mu_k)$, and has two fixed points

(11.13) $x_k^+ = \tfrac12\mu_k\left[-1 + \sqrt{1 + 4c_k/\mu_k}\,\right] \in (0,\infty), \qquad x_k^- = \tfrac12\mu_k\left[-1 - \sqrt{1 + 4c_k/\mu_k}\,\right] \in (-\infty,0),$

of which the first is attractive ($f_k'(x_k^+) < 1$) and the second is repulsive ($f_k'(x_k^-) > 1$). For us
only $x_k^+$ is relevant because, as is clear from (11.10), our iterations take place on $(0,\infty)$. See
Fig. 5 for a picture of $f_k$.
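To make the fixed-point structure concrete, the sketch below evaluates (11.11) and (11.13) for one fixed pair of coefficients and iterates the map from $x_0 = 0$. It assumes the reading $f(x) = c(x+\mu)/(x+c+\mu)$ of (11.11); the values $c = 2$, $\mu = 0.5$ and the function names are illustrative only.

```python
from math import sqrt, isclose


def f(x, c, mu):
    """Moebius map (11.11): f(x) = c (x + mu) / (x + c + mu)."""
    return c * (x + mu) / (x + c + mu)


def f_prime(x, c, mu):
    """Derivative of (11.11): determinant c^2 over the squared denominator."""
    return c * c / (x + c + mu) ** 2


def fixed_points(c, mu):
    """Attractive and repulsive fixed points (11.13)."""
    root = sqrt(1.0 + 4.0 * c / mu)
    return 0.5 * mu * (-1.0 + root), 0.5 * mu * (-1.0 - root)


c, mu = 2.0, 0.5  # illustrative values, not from the paper
x_plus, x_minus = fixed_points(c, mu)

# Iterate x_{k+1} = f(x_k) from x_0 = 0 with constant coefficients.
x = 0.0
for _ in range(200):
    x = f(x, c, mu)

print(x_plus, x_minus, f_prime(x_plus, c, mu), f_prime(x_minus, c, mu), x)
```

With constant coefficients the iteration started at $0$ stays in $(0,\infty)$ and approaches $x^+$ geometrically, at rate $f'(x^+) < 1$; the $k$-dependent case treated in Section 11.3 is where Kooman's theorems are needed.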



fk{x) 









•'•/,■ / 












4 



X 



Figure 5: The Mobius-transformation x h->- f k {x). 



In what follows, we will use the following two theorems of Kooman [K98]. We state the
version of these theorems for $\mathbb{R}$, although they apply for $\mathbb{C}$ as well.

Theorem 11.1. [Kooman [K98], Corollary 6.5]

Given a sequence of Möbius-transformations $(f_k)_{k\in\mathbb{N}_0}$ on $\mathbb{R}^*$ that converges pointwise to a
Möbius-transformation $f$ that is hyperbolic. Then, for one choice of $x_0 \in \mathbb{R}^*$ the solution of
the recursion relation $x_{k+1} = f_k(x_k)$, $k \in \mathbb{N}_0$, converges to the repulsive fixed point $x^-$ of $f$,
while for all other choices of $x_0$ it converges to the attractive fixed point $x^+$ of $f$.



Theorem 11.2. [Kooman [K98], Theorem 7.1]

Given a sequence of Möbius-transformations $(f_k)_{k\in\mathbb{N}_0}$ on $\mathbb{R}^*$ whose fixed points are of bounded
variation and converge to (necessarily finite) distinct limits, i.e.,

(11.14) $\sum_{k\in\mathbb{N}_0} |x_{k+1}^+ - x_k^+| < \infty, \quad \sum_{k\in\mathbb{N}_0} |x_{k+1}^- - x_k^-| < \infty, \quad x^+ = \lim_{k\to\infty} x_k^+, \ x^- = \lim_{k\to\infty} x_k^- \in \mathbb{R}^*, \quad x^+ \neq x^-.$

If

(11.15) $\prod_{k\in\mathbb{N}_0} |f_k'(x_k^+)| = 0,$

then, for one choice of $x_0 \in \mathbb{R}^*$, the solution of the recursion relation $x_{k+1} = f_k(x_k)$, $k \in \mathbb{N}_0$,
converges to $x^-$, while for all other choices of $x_0$ it converges to $x^+$. If, on the other hand,

(11.16) $\prod_{k\in\mathbb{N}_0} |f_k'(x_k^+)| > 0,$

then all choices of $x_0 \in \mathbb{R}^*$ lead to different limits.



Theorem 11.1 deals with the situation in which there is a limiting hyperbolic Möbius-transformation,
while Theorem 11.2 deals with the more general situation in which the limiting
Möbius-transformation may not exist or may not be hyperbolic, but the fixed points do
converge to distinct finite limits and they do so in a summable manner. (In Theorem 11.1
it is automatic that the fixed points of $f_k$ converge to the fixed points of $f$.) The conditions
in (11.14-11.15) are necessary to ensure that the solutions of the recursion relation can reach
the limits of the fixed points. Indeed, condition (11.16) prevents precisely that. As is evident
from Fig. 5, the single value of $x_0$ for which the solution converges to the limit of the repulsive
fixed point must satisfy $x_0 < 0$, which is excluded in our case because $x_0 = 0$.



11.3 Scaling of the volatility for polynomial coefficients 



Theorem 1.6 shows four regimes. Our key assumptions are (1.55-1.58). For the scaling
behaviour as $k \to \infty$ of the attractive fixed point $x_k^+$ given in (11.13) there are three regimes
depending on the value of $K$:

(11.17) $x_k^+ \sim \begin{cases} c_k, & \text{if } K = \infty, \\ M^+ c_k, & \text{if } K \in (0,\infty) \text{ with } M^+ = \tfrac12 K\left[-1 + \sqrt{1 + (4/K)}\,\right], \\ \sqrt{c_k\mu_k}, & \text{if } K = 0. \end{cases}$
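The three regimes in (11.17) follow directly from the closed form (11.13), and can be checked numerically. The sketch below treats $K$ as the limiting value of $\mu_k/c_k$ (an assumption consistent with the formula for $M^+$, since the definition of $K$ lies outside this section) and evaluates the ratio of $x^+$ to its claimed asymptotic form for one large, one fixed and one small value of $\mu/c$. All names and numerical values are illustrative.

```python
from math import sqrt


def x_plus(c, mu):
    """Attractive fixed point (11.13)."""
    return 0.5 * mu * (-1.0 + sqrt(1.0 + 4.0 * c / mu))


c = 1.0

# K = infinity: mu/c very large, x^+ should be close to c.
ratio_K_inf = x_plus(c, 1e6) / c

# K in (0, infinity): mu = K c exactly, x^+ equals M^+ c.
K = 3.0
M_plus = 0.5 * K * (-1.0 + sqrt(1.0 + 4.0 / K))
ratio_K_fin = x_plus(c, K * c) / (M_plus * c)

# K = 0: mu/c very small, x^+ should be close to sqrt(c mu).
mu_small = 1e-8
ratio_K_zero = x_plus(c, mu_small) / sqrt(c * mu_small)

print(ratio_K_inf, ratio_K_fin, ratio_K_zero)
```

All three ratios are close to 1; the first and third approach 1 only in the limit, which is what the symbol $\sim$ in (11.17) expresses.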



Our target will be to show that (recall $x_k$ from (11.10))

(11.18) $x_k \sim x_k^+$ as $k \to \infty$,

which is the scaling we are after in Theorem 1.6(a-c). We will see that (11.18) holds for
$K \in (0,\infty]$, and also for $K = 0$ when $L = \infty$. A different situation arises for $K = 0$ when
$L < \infty$, namely, $x_k \asymp 1/\sigma_k$, which is the scaling we are after in Theorem 1.6(d).



For the proofs given in Sections 11.3.1-11.3.4 below we make use of Theorems 11.1-11.2
after doing the appropriate change of variables. Along the way we need the following
elementary facts:

(I) If $(a_k)$ and $(b_k)$ have bounded variation, then both $(a_k + b_k)$ and $(a_k b_k)$ have bounded
variation.

(II) If $(a_k)$ has bounded variation and $h\colon \mathbb{R} \to \mathbb{R}$ is globally Lipschitz on a compact interval
containing the tail of $(a_k)$, then $(h(a_k))$ has bounded variation.

(III) If $(a_k)$ is bounded and is asymptotically monotone, then it has bounded variation.

Moreover, the following notion will turn out to be useful. According to Bingham, Goldie and
Teugels [BGT87, Section 1.8], a strictly positive sequence $(a_k)$ is said to be smoothly varying
with index $p \in \mathbb{R}$ if

(11.19) $\lim_{k\to\infty} k^n a_k^{[n]}/a_k = p(p-1) \times \cdots \times (p-n+1), \qquad n \in \mathbb{N},$

where $a_k^{[n]}$ is the $n$-th order discrete derivative, i.e., $a_k^{[0]} = a_k$ and $a_k^{[n+1]} = a_{k+1}^{[n]} - a_k^{[n]}$.

(IV) If $(a_k)$ is smoothly varying with index $p \notin \mathbb{N}_0$, then $(a_k^{[n]})$ is asymptotically monotone
for all $n \in \mathbb{N}$, while if $p \in \mathbb{N}_0$, then the same is true for all $n \in \mathbb{N}$ with $n < p$.

This observation will be useful in combination with (I-III).

According to [BGT87, Theorem 1.8.2], if $(a_k)$ is regularly varying with index $p \in \mathbb{R}$,
then there exist smoothly varying $(a_k')$ and $(a_k'')$ with index $p$ such that $a_k' \leq a_k \leq a_k''$ and
$a_k' \sim a_k''$. In words, any regularly varying function can be sandwiched between two smoothly
varying functions with the same asymptotic behaviour. In view of the monotonicity property
in Theorem 1.5(a), it therefore suffices to prove Theorem 1.6 under the following assumption,
which is stronger than (1.55):

(11.20) $(c_k)$, $(\mu_k)$, $(\mu_k/c_k)$, $(k^2\mu_k/c_k)$ are smoothly varying
(with index $a$, $b$, $a - b$, respectively, $2 + a - b$).



11.3.1 Case (b) 



Let $K \in (0,\infty)$. Put $y_k = x_k/c_k$. Then the recursion relation in (11.10) becomes

(11.21) $y_0 = 0, \qquad y_{k+1} = g_k(y_k), \qquad k \in \mathbb{N}_0,$

where

(11.22) $g_k(y) = \frac{A_k y + B_k}{C_k y + D_k}, \qquad y \in \mathbb{R}^*,$

with coefficients

(11.23) $A_k = \frac{c_k^2}{c_{k+1}}, \qquad B_k = \frac{c_k\mu_k}{c_{k+1}}, \qquad C_k = c_k, \qquad D_k = c_k + \mu_k.$

By (1.55), we have $c_k/c_{k+1} \sim 1$, and hence $A_k \sim C_k \sim c_k$, $B_k \sim K c_k$, $D_k \sim (K+1)c_k$.
Therefore, (11.22) yields

(11.24) $\lim_{k\to\infty} g_k(y) = g(y) = \frac{y + K}{y + (K+1)}, \qquad y \in \mathbb{R}^*.$

Since $g$ is hyperbolic with fixed points $y^\pm = \tfrac12 K\left[-1 \pm \sqrt{1 + (4/K)}\,\right]$, we can apply
Theorem 11.1 and conclude that

(11.25) $\lim_{k\to\infty} y_k = M^+.$
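The convergence in (11.25) can be observed numerically. The following is a minimal sketch under illustrative assumptions that are not taken from the paper: $c_k = (k+1)^{1/2}$ and $\mu_k = K c_k$ with $K = 2$, so that $\mu_k/c_k = K$ for all $k$ and $c_k/c_{k+1} \to 1$. It iterates the original recursion (11.10) and then rescales by $c_k$.

```python
from math import sqrt

K = 2.0        # assumed limit of mu_k / c_k (illustrative)
a = 0.5        # assumed growth exponent of c_k (illustrative)
n = 20_000     # number of iterations

x = 0.0        # x_0 = 0 as in (11.10)
for k in range(n):
    c_k = (k + 1.0) ** a
    mu_k = K * c_k
    x = c_k * (x + mu_k) / (x + c_k + mu_k)   # x_{k+1} = f_k(x_k)

c_n = (n + 1.0) ** a
y_n = x / c_n                                  # y_n = x_n / c_n
M_plus = 0.5 * K * (-1.0 + sqrt(1.0 + 4.0 / K))

print(y_n, M_plus)
```

The residual $|y_n - M^+|$ decays roughly like $1/n$ in this example, reflecting the $O(1/k)$ deviation of $c_k/c_{k+1}$ from 1 rather than any slowness of the contraction towards the attractive fixed point.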

11.3.2 Case (a) 



y e 



\K[-l± v 7 ! + (4/iT)], we can apply 



Let $K = \infty$. Again put $y_k = x_k/c_k$. Then the same recursion relation as in (11.21-11.22)
holds with the same coefficients as in (11.23), but this time $c_k/c_{k+1} \sim 1$ gives $A_k \sim C_k \sim c_k$,
$B_k \sim D_k \sim \mu_k$, and

(11.26) $\lim_{k\to\infty} g_k(y) = g(y) = 1, \qquad y \in \mathbb{R}^*.$

Since $g$ is not hyperbolic, we cannot apply Theorem 11.1. To compute $y^\pm = \lim_{k\to\infty} y_k^\pm$, we
note that $g_k$ has fixed points

(11.27) $y_k^\pm = \frac{1}{a_k}\,h^\pm(b_k/a_k^2)$ with $h^\pm(x) = \frac{1}{2x}\left(1 \mp \sqrt{1+4x}\right)$, $a_k = \frac{A_k - D_k}{B_k}$, $b_k = \frac{C_k}{B_k}$

(use that $a_k < 0$ for $k$ large enough). Since $c_k/\mu_k \to 0$, we have $a_k \to -1$ and $b_k \to 0$. It
follows that $y_k^+ \to y^+ = 1$ and $y_k^- \to y^- = -\infty$, so that we can apply Theorem 11.2. To prove
that $y_k \to y^+ = 1$, we need to check that (recall (11.14-11.15))

(1) $(y_k^+)_{k\in\mathbb{N}}$ has bounded variation.

(2) $\prod_{k\in\mathbb{N}} g_k'(y_k^+) = 0$.

(What happens near $y^-$ is irrelevant because $x_k > 0$ for all $k$.)

To prove (1), note that $h^+$ is globally Lipschitz near zero. Since, by (11.23) and (11.27),

(11.28) $a_k = \frac{c_k}{\mu_k} - \frac{c_{k+1}}{c_k}\left(1 + \frac{c_k}{\mu_k}\right), \qquad b_k = \frac{c_{k+1}}{c_k}\,\frac{c_k}{\mu_k},$

it follows from (1.56), (I), (III-IV) and (11.20) that $(a_k)$ and $(b_k)$ have bounded variation.
Since $a_k \to -1$ and $b_k \to 0$, it in turn follows from (I-II) that $(1/a_k)$ and $(b_k/a_k^2)$ have bounded
variation. Via (I-II) this settles (1).
To prove (2), note that

(11.29) $g_k'(y_k^+) = \frac{\Delta_k}{(C_k y_k^+ + D_k)^2}$ with $\Delta_k = A_k D_k - B_k C_k$.

Since $y_k^+ > 0$ and $D_k > \mu_k$, we have

(11.30) $\prod_{k\in\mathbb{N}} g_k'(y_k^+) \leq \prod_{k\in\mathbb{N}} \frac{\Delta_k}{\mu_k^2}.$

But $\Delta_k = c_k^3/c_{k+1}$ and so, because $c_k/c_{k+1} \sim 1$, we have $\Delta_k/\mu_k^2 = c_k^3/(c_{k+1}\mu_k^2) \sim (c_k/\mu_k)^2 \to 0$.
Hence (2) indeed holds.

11.3.3 Case (c)

Let $K = 0$ and $L = \infty$. Put $y_k = x_k/\sqrt{c_k\mu_k}$. Then the same recursion relation as in
(11.21-11.22) holds with coefficients

(11.31) $A_k = c_k\sqrt{\frac{c_k\mu_k}{c_{k+1}\mu_{k+1}}}, \qquad B_k = c_k\mu_k\sqrt{\frac{1}{c_{k+1}\mu_{k+1}}}, \qquad C_k = \sqrt{c_k\mu_k}, \qquad D_k = c_k + \mu_k.$

By (1.55), $c_{k+1}/c_k \sim 1$ and $\mu_{k+1}/\mu_k \sim 1$, and hence $A_k \sim D_k \sim c_k$, $B_k \sim C_k \sim \sqrt{c_k\mu_k}$.
Therefore (11.22) yields

(11.32) $\lim_{k\to\infty} g_k(y) = g(y) = y, \qquad y \in \mathbb{R}^*.$

Since $g$ is not hyperbolic, we cannot apply Theorem 11.1. To compute $y^\pm = \lim_{k\to\infty} y_k^\pm$ from
(11.27), we abbreviate

(11.33) $\alpha_k = \frac{c_{k+1}}{c_k} - 1, \qquad \beta_k = \frac{\mu_{k+1}}{\mu_k} - 1, \qquad \gamma_k = \frac{\mu_k}{c_k},$

and write

(11.34) $a_k = \frac{1}{\sqrt{\gamma_k}}\left[1 - (1+\gamma_k)\sqrt{(1+\alpha_k)(1+\beta_k)}\,\right], \qquad b_k = \sqrt{(1+\alpha_k)(1+\beta_k)}.$

We have $\alpha_k \to 0$, $\beta_k \to 0$, $\gamma_k \to 0$. Moreover, (1.56-1.58), (IV) and (11.20) imply that $(k\alpha_k)$
and $(k\beta_k)$ are asymptotically monotone and bounded. Together with $\lim_{k\to\infty} k^2\gamma_k = \infty$ this
in turn implies that $\alpha_k/\sqrt{\gamma_k} \to 0$ and $\beta_k/\sqrt{\gamma_k} \to 0$. Hence $a_k \to 0$ and $b_k \to 1$, and therefore
(11.27) yields $y^\pm = \pm 1$, so that we can apply Theorem 11.2.

To prove (1), note that (1.56-1.58), (IV) and (11.20) also imply that $(\sqrt{\gamma_k})$ and $(1/\sqrt{k^2\gamma_k})$
are asymptotically monotone and bounded. By (11.34) and (I-III), this in turn implies that
$(a_k)$ and $(b_k)$ have bounded variation. Indeed, the first equality in (11.34) can be rewritten
as

(11.35) $a_k = \frac{1}{\sqrt{\gamma_k}}\;\frac{1 - (1+\gamma_k)^2(1+\alpha_k)(1+\beta_k)}{1 + (1+\gamma_k)\sqrt{(1+\alpha_k)(1+\beta_k)}}.$

The denominator tends to 2, is Lipschitz near 2, and has bounded variation because $(\alpha_k)$,
$(\beta_k)$, $(\gamma_k)$ have bounded variation. The numerator equals $-\alpha_k - \beta_k - 2\gamma_k$ plus terms that are
products of $\alpha_k$, $\beta_k$ and $\gamma_k$. Writing $\alpha_k/\sqrt{\gamma_k} = k\alpha_k/\sqrt{k^2\gamma_k}$ and $\beta_k/\sqrt{\gamma_k} = k\beta_k/\sqrt{k^2\gamma_k}$, and
using that $\sqrt{k^2\gamma_k} \to \infty$, we therefore easily get the claim.
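To prove (2), note that

(11.36) $\Delta_k = c_k^2\sqrt{\dfrac{c_k\mu_k}{c_{k+1}\mu_{k+1}}} = \dfrac{c_k^2}{\sqrt{(1+\alpha_k)(1+\beta_k)}}$, $C_k y_k^+ + D_k = c_k\big(1 + \sqrt{\gamma_k}\,y_k^+ + \gamma_k\big)$,

and hence

(11.37) $\prod_{k\in\mathbb{N}} g_k'(y_k^+) \le \prod_{k\in\mathbb{N}} \dfrac{1}{\sqrt{(1+\alpha_k)(1+\beta_k)}\,\big(1 + \sqrt{\gamma_k}\,y_k^+ + \gamma_k\big)^2}$.

The term under the product equals

(11.38) $1 - 2y^+\sqrt{\gamma_k}\,[1+o(1)]$,

which yields (2) because $\sqrt{k^2\gamma_k} \to \infty$.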
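The scaling in Case (c) can be illustrated numerically. The sketch below is not part of the proof. It assumes that the recursion behind (11.21-11.22) is $x_{k+1} = c_k(\mu_k + x_k)/(c_k + \mu_k + x_k)$, which is what the coefficients in (11.31) give after undoing the substitution $y_k = x_k/\sqrt{c_k\mu_k}$, and that $K$ and $L$ denote the limits of $\mu_k/c_k$ and $k^2\mu_k/c_k$. The choice $c_k = 1$, $\mu_k = 1/k$, the start value $x_1 = 0$ and all function names are hypothetical and serve only as an example with $K = 0$ and $L = \infty$.

```python
import math


def iterate_recursion(c, mu, n_steps, x0=0.0):
    """Iterate x_{k+1} = c_k (mu_k + x_k) / (c_k + mu_k + x_k) for k = 1..n_steps.

    The recursion is an assumption reconstructed from the coefficients (11.31).
    Returns x_{n_steps + 1}.
    """
    x = x0
    for k in range(1, n_steps + 1):
        c_k, mu_k = c(k), mu(k)
        x = c_k * (mu_k + x) / (c_k + mu_k + x)
    return x


def case_c_ratio(n_steps):
    """Return y_k = x_k / sqrt(c_k mu_k) at k = n_steps + 1 for an example sequence."""
    c = lambda k: 1.0        # hypothetical migration coefficients
    mu = lambda k: 1.0 / k   # hypothetical rates: mu_k/c_k -> 0, k^2 mu_k/c_k -> infinity
    x_next = iterate_recursion(c, mu, n_steps)
    k_next = n_steps + 1
    return x_next / math.sqrt(c(k_next) * mu(k_next))


if __name__ == "__main__":
    # The printed ratios should approach y^+ = 1 as n grows.
    for n in (10**2, 10**3, 10**4, 10**5):
        print(n, case_c_ratio(n))
```

For this example the ratio approaches $y^+ = 1$ slowly, consistent with the per-step contraction factor $1 - 2y^+\sqrt{\gamma_k}\,[1+o(1)]$ in (11.38).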
11.3.4 Case (d)



Let $K = 0$ and $L < \infty$. Put $y_k = \sigma_k x_k$. Then the same recursion relation as in (11.21-11.22) holds with coefficients

(11.39) $A_k = c_k\dfrac{\sigma_{k+1}}{\sigma_k}$, $B_k = c_k\mu_k\sigma_{k+1}$, $C_k = \dfrac{1}{\sigma_k}$, $D_k = c_k + \mu_k$.

Abbreviate

(11.40) $\delta_k = \dfrac{\sigma_{k+1}}{\sigma_k} - 1 = \dfrac{1}{c_k\sigma_k}$.



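We have $k\mu_k/c_k \to 0$ and, by (1.55), $c_{k+1}/c_k \sim 1$, $\sigma_{k+1}/\sigma_k \sim 1$ and $k\delta_k \to 1-a$ with $a \in (-\infty,1)$ the exponent in (1.55). It therefore follows that

(11.41) $\dfrac{A_k}{D_k} \sim 1$, $\dfrac{B_k}{D_k} \sim \mu_k\sigma_k = \dfrac{k\mu_k}{c_k}\,\dfrac{1}{k\delta_k} \to 0$, $\dfrac{C_k}{D_k} \sim \dfrac{1}{c_k\sigma_k} = \delta_k \to 0$.

Hence (11.22) yields

(11.42) $\lim_{k\to\infty} g_k(y) = g(y) = y$ for all $y$.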



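Since $g$ is not hyperbolic, we cannot apply Theorem 11.1. To compute $y^\pm = \lim_{k\to\infty} y_k^\pm$, we rewrite (11.27) as

(11.43) $y_k^\pm = \tfrac12\Big[a_k \pm \sqrt{a_k^2 + 4b_k}\Big]$ with $a_k = \dfrac{A_k - D_k}{C_k}$, $b_k = \dfrac{B_k}{C_k}$,

and note that

(11.44) $a_k = 1 - \mu_k\sigma_k = 1 - \dfrac{k\mu_k}{c_k}\,\dfrac{1}{k\delta_k}$, $b_k = c_k\mu_k\sigma_k\sigma_{k+1} = \dfrac{k^2\mu_k}{c_k}\,\dfrac{\sigma_{k+1}}{\sigma_k}\,\dfrac{1}{(k\delta_k)^2}$.

Since $k^2\mu_k/c_k \to L < \infty$ and $k\delta_k \to 1-a$ with $a \in (-\infty,1)$ the exponent in (1.55), it follows that $a_k \to 1$ and $b_k \to L/(1-a)^2$. Hence $y_k^\pm \to y^\pm = \tfrac12\big[1 \pm \sqrt{1 + 4L/(1-a)^2}\big]$, so that we can apply Theorem 11.2.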



To prove (1), note that (1.56-1.58), (IV) and (11.20) imply that $(a_k)$ and $(b_k)$ have bounded variation. This yields the claim via (11.43).

To prove (2), note that

(11.45) $\Delta_k = c_k^2\dfrac{\sigma_{k+1}}{\sigma_k} = c_k^2(1+\delta_k)$, $C_k y_k^+ + D_k = \dfrac{y_k^+}{\sigma_k} + c_k + \mu_k = c_k\Big(1 + \delta_k y_k^+ + \dfrac{\mu_k}{c_k}\Big)$,

and, hence,

(11.46) $\prod_{k\in\mathbb{N}} g_k'(y_k^+) \le \prod_{k\in\mathbb{N}} \dfrac{1+\delta_k}{(1+\delta_k y_k^+)^2}$.

The term under the product equals

(11.47) $1 - (2y^+ - 1)\,\delta_k\,[1+o(1)]$.

Since $y^+ \ge 1$, it follows that (2) holds if and only if $\sum_{k\in\mathbb{N}} \delta_k = \infty$, which by (11.9) and (11.40) holds if and only if $\lim_{k\to\infty} \sigma_k = \infty$. Theorem 11.2 shows that failure of (2) implies that $y_k$ converges to a limit different from 1.
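A small numerical sketch may help to see the limit of $y_k = \sigma_k x_k$ in Case (d). It is an illustration under assumptions, not part of the argument. The recursion $x_{k+1} = c_k(\mu_k + x_k)/(c_k + \mu_k + x_k)$ is reconstructed from the coefficients in (11.39), and $\sigma_k$ is taken to satisfy $\sigma_{k+1} - \sigma_k = 1/c_k$ with $\sigma_0 = 0$, as suggested by (11.40). The example uses the hypothetical choice $c_k = 1$ (so that $a = 0$ and $\sigma_k = k$) and $\mu_k = L/(k+1)^2$. For $a = 0$ the positive fixed point obtained from (11.43) is $\tfrac12[1 + \sqrt{1+4L}]$.

```python
import math


def case_d_scaled_value(L, n_steps):
    """Return sigma_k * x_k at k = n_steps for c_k = 1 and mu_k = L / (k + 1)^2.

    Assumed recursion: x_{k+1} = c_k (mu_k + x_k) / (c_k + mu_k + x_k), x_0 = 0.
    Assumed normalisation: sigma_{k+1} = sigma_k + 1 / c_k, sigma_0 = 0.
    """
    x = 0.0
    sigma = 0.0
    for k in range(n_steps):
        c_k = 1.0                     # hypothetical migration coefficient
        mu_k = L / (k + 1) ** 2       # hypothetical rate with k^2 mu_k / c_k -> L
        x = c_k * (mu_k + x) / (c_k + mu_k + x)
        sigma += 1.0 / c_k
    return sigma * x


def predicted_limit_a0(L):
    """Positive root of y^2 - y - L = 0, i.e. the limit from (11.43) when a = 0."""
    return 0.5 * (1.0 + math.sqrt(1.0 + 4.0 * L))


if __name__ == "__main__":
    for L in (0.75, 2.0):
        print(L, case_d_scaled_value(L, 10**5), predicted_limit_a0(L))
```

In this example $\sigma_k = k \to \infty$, so condition (2) holds and the iterates settle near the predicted value. The value $L = 0$ is left out on purpose, since the start value $x_0 = 0$ is then a fixed point of the recursion.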
11.4 Scaling of the volatility for exponential coefficients



In this section, we briefly comment on how to extend the proof of Theorem 1.6 to cover the case of Theorem 1.7.

The claims made for Cases (A) and (B) follow from minor adaptations of the arguments for Cases (a) and (b) in Sections 11.3.2 and 11.3.1. The claim made for Case (C1) follows from Theorem 1.5(d). The claims made for Cases (C2) and (C3) follow from minor adaptations of the arguments for Cases (b) and (c) in Sections 11.3.1 and 11.3.3. The details are left to the reader.



12 Notation index 

12.1 General notation 

E ~> compact Polish space of types. 
$\mathcal{P}(E)$ ~> set of probability measures on $E$.
$M(E)$ ~> set of measurable functions on $E$.
$\mathcal{M}([0,1])$ ~> set of non-negative measures on $[0,1]$.
$\mathcal{M}_f([0,1])$ ~> set of finite non-negative measures on $[0,1]$.
$\mathcal{L}$ ~> law.
$\Longrightarrow$ ~> weak convergence on path space.
A*€ A<([0,1])~» (cf. (p|). 



12 



$\Lambda \in \mathcal{M}_f([0,1])$ ~> (cf. Section 1.3).
$[\delta_a]$ ~> Gâteaux derivative of $F$ with respect to $x_i$ in the direction $\delta_a$ (cf. (1.13)).
$D(T,\mathcal{E})$ ~> set of càdlàg paths in $\mathcal{E}$ indexed by the elements of $T \subset \mathbb{R}$ and equipped with the Skorokhod $J_1$-topology.
$C_b(\mathcal{E},\mathcal{E}')$ ~> set of continuous bounded mappings from $\mathcal{E}$ to $\mathcal{E}'$.

12.2 Interacting $\Lambda$-Cannings processes

$\Omega_N$ ~> hierarchical group of order $N$ (cf. (1.20)).
$c = (c_k)_{k\in\mathbb{N}_0} \in (0,\infty)^{\mathbb{N}_0}$ ~> migration coefficients (cf. (1.24)).
$\Lambda = (\Lambda_k)_{k\in\mathbb{N}_0} \in \mathcal{M}_f([0,1])^{\mathbb{N}_0}$ ~> offspring measures (cf. (1.27)).
$\lambda_k = \Lambda_k([0,1])$ ~> resampling rates (cf. (1.29)).
$d = (d_k)_{k\in\mathbb{N}}$ ~> volatility constants (cf. (1.42)).



ZZi= (^fc)fceNo ( cf - (1-44)). 



2^k 



(cf. (1.44)). 
12.



u k <s* (cf. (fL49|». 



$B_k(\eta)$ ~> $k$-block around $\eta$ (cf. (1.22)).
$y_{\eta,k}$ ~> type distribution in $B_k(\eta)$ (cf. (1.30)).
$C^{\Lambda}$-process ~> non-spatial continuum-mass $\Lambda$-Cannings process (cf. Section 1.3.1).
$a^{(N)}(\cdot,\cdot)$ ~> hierarchical random walk kernel on $\Omega_N$ (cf. (1.25)).
$C_N^{\underline{c},\underline{\Lambda}}$-process ~> hierarchically interacting Cannings process on $\Omega_N$ (cf. Section 1.4.4).



L^ N \ L^l, Lres ~> generators of the mean-field Cannings process (cf. (1.11)). 



L^ N \ L^l , Lres W ^ generators of the hierarchical Cannings process (cf. (1.35)). 



r,a,B k (rj) 



reshuffling-resampling map (cf. (1.38)). 



-process (cf. Section 1.4.4b. 



(•) ~* macroscopic observables (= block averages) of x( njv ) (cf. (1.40)). 



rj,k 
[11 



2/V 



l-block averages indexed block- wise (cf. (7.18)). 



Gn,k if- level truncation of J)jv (cf. (1.39)) 



$X^{(N)}$ ~> mean-field interacting Cannings process (cf. Section 1.3.2).
$Q_x(du,dv)$ ~> Fleming-Viot diffusion function (cf. (1.18)).
$L_\theta^{c,d,\Lambda}$, $L_\theta^{c}$, $L^{d}$, $L^{\Lambda}$ ~> generators of the McKean-Vlasov process (cf. (1.16)).



^c,d,A ^ McKean-Vlasov process with immigration-emigration (cf. Section 1.3.3). 

c.d.A 



unique equilibrium of Z (cf. (4.1)). 



Y^l N (•) ~* macroscopic observables (= block averages) of X^ Un ^ (cf. (1.40)). 



interaction chain (cf. Section 



1.5.5). 



(M£ ' ;fc=-o+i),...,o 

12.3 Spatial $\Lambda$-coalescents

$[n] = \{1,\ldots,n\}$.
$\Pi_n$ ~> set of all partitions of $[n]$ into disjoint families (cf. (2.4)).
$\Pi_{G,n}$ ~> set of $G$-labelled partitions of $[n]$ (cf. (2.6)).
$S_{G,n} \in \Pi_{G,n}$ ~> $G$-labelled partition into singletons (cf. (2.7)).
$\Pi$, $\Pi_G$ ~> partitions of $\mathbb{N}$, $G$-labelled partitions of $\mathbb{N}$ (cf. (2.10)).
$L(\pi_G)$ ~> set of labels of partition $\pi_G$ (cf. (2.9)).



A^V ^ coalescence-rates (cf. (2.13)). 
$\cdot|_n$ ~> operation of projection from $[m]$ (respectively, $\mathbb{N}$) onto $[n]$.

^(n N )*^ L^jg '*, L^^* ^> generators of the hierarchical Cannings-coalescent (cf. (2.32)). 

$\mathfrak{P}$ ~> field of Poisson point processes driving the spatial $\Lambda$-coalescent (cf. (2.14)).
$\mathfrak{P}^{(\Omega_N)}$ ~> driving Poisson point process for the spatial $n$-$\Lambda$-coalescent with block coalescence (cf. (2.26)).



r(G) 



spatial n-A-coalescent on G (cf. (2.17)). 



$\mathfrak{C}^{(G)}$ ~> spatial $\Lambda$-coalescent (cf. (2.19)).
$\mathfrak{C}^{(\Omega_N)}$ ~> spatial $\Lambda$-coalescent with block coalescence (cf. (2.30)).